
Page 1: Novelty generation with deep learning

Novelty generation with deep learning

Presented by: Cherti Mehdi

joint work with Balázs Kégl and Akın Kazakçı

Page 2: Novelty generation with deep learning

Roadmap

• Research questions

• Why machine learning and deep learning?

• Generating new types of objects (C) using previously acquired knowledge (K)

• Evaluating novelty with out-of-class generation metrics

• Perspectives

Page 3: Novelty generation with deep learning

Summary

• Recently, generative models have gained momentum. But such models are almost exclusively used in a prediction pipeline.

• Our objective is to study:

• a) whether such models can be used to generate novelty

• b) how to evaluate their capacity for generating novelty

Page 4: Novelty generation with deep learning

Research questions

• What is meant by the generation of novelty?

• How can novelty be generated?

• How can a model generating novelty be evaluated?

Page 5: Novelty generation with deep learning

Why machine learning and deep learning?

• Knowledge is important: machine learning enables the study of creativity in relation to knowledge

• Generative modeling: we want to generate objects

• Composition of features is important: deep learning models can automatically learn a hierarchy of features of growing abstraction from raw data

I focus my thesis on deep generative models.

Page 6: Novelty generation with deep learning

Generating new types of objects

In Kazakçı et al. 2016:

• We show that symbols of new types can be generated by carefully tuned autoencoders

• We take a first step toward defining the conceptual and experimental framework of novelty generation

Page 7: Novelty generation with deep learning

Generating new types of objects: autoencoders

• Autoencoders have existed for a long time (Kramer 1991)

• Deep variants are more recent (Hinton & Salakhutdinov, 2006; Bengio, 2009)

• A deep autoencoder learns successive transformations that decompose and then recompose a set of training objects

• The depth allows learning a hierarchy of transformations

• Two ways of learning an autoencoder: with an undercomplete (bottleneck) or an overcomplete representation

Slide adapted from Kazakçı et al. 2016

Page 8: Novelty generation with deep learning

Generating new types of objects: autoencoders with undercomplete representation

[Figure: deep autoencoder with a bottleneck, from Hinton, G. E., & Salakhutdinov, R. R. (2006). An input of dimension 625 is encoded down to a narrow bottleneck, then decoded back into a reconstruction.]
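Below is a minimal PyTorch sketch of such a bottleneck autoencoder; the layer sizes are illustrative choices, not the architecture from the figure or the paper. The narrow code is what prevents the network from simply copying its input.

```python
# Illustrative deep undercomplete autoencoder: the bottleneck forces a
# low-dimensional code, so the network cannot learn the identity map.
import torch
import torch.nn as nn

class BottleneckAutoencoder(nn.Module):
    def __init__(self, input_dim=625, code_dim=32):
        super().__init__()
        # Encode: successive transformations down to a narrow code.
        self.encode = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, code_dim),
        )
        # Decode: mirror transformations back to the input dimension.
        self.decode = nn.Sequential(
            nn.Linear(code_dim, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decode(self.encode(x))

model = BottleneckAutoencoder()
x = torch.rand(8, 625)                      # a batch of flattened images
loss = nn.functional.mse_loss(model(x), x)  # reconstruction error
```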

Page 9: Novelty generation with deep learning

Generating new types of objects: autoencoders with overcomplete representation

• Autoencoders can also be learned using an overcomplete representation

• Problem: risk of learning the identity function

• One solution: constrain the representation to be “simple”

• Example: enforce sparsity of the representation with sparse autoencoders (a minimal sketch follows below)
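A minimal sketch of one way to keep an overcomplete representation “simple”: an L1 penalty on the code. The slides only say that sparsity is enforced; the penalty form, layer sizes, and weight here are illustrative assumptions.

```python
# Overcomplete autoencoder with an L1 sparsity penalty on the code:
# the penalty discourages the trivial identity solution.
import torch
import torch.nn as nn

encode = nn.Sequential(nn.Linear(625, 2000), nn.ReLU())   # code larger than input
decode = nn.Sequential(nn.Linear(2000, 625), nn.Sigmoid())

x = torch.rand(8, 625)            # a batch of flattened images
code = encode(x)
recon = decode(code)
sparsity_weight = 1e-3            # hypothetical value
loss = nn.functional.mse_loss(recon, x) + sparsity_weight * code.abs().mean()
```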

Page 10: Novelty generation with deep learning

Generating new types of objects: autoencoders with overcomplete representation

• What does a sparse autoencoder end up learning?

• Detect features with the encode function

• Superpose the detected features in the reconstructed image with the decode function

• Benefits of an overcomplete representation with sparsity: for each image only a small fraction of the features is used, but different images use different subsets of features (sketched in the code below)

[Figure taken from Makhzani, A., & Frey, B. (2013); k is the sparsity rate in %.]
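A minimal sketch of the k-sparse operation from Makhzani & Frey (2013): after encoding, only the k largest hidden activations per image are kept and the rest are zeroed, so each image uses a small, image-specific subset of features. The sizes and the value of k are illustrative.

```python
# k-sparse step: keep the top-k activations in each row, zero the rest.
import torch

def k_sparse(code, k):
    topk = torch.topk(code, k, dim=1)       # k largest activations per image
    mask = torch.zeros_like(code)
    mask.scatter_(1, topk.indices, 1.0)     # 1 at the top-k positions
    return code * mask

code = torch.rand(8, 2000)                  # overcomplete hidden activations
sparse_code = k_sparse(code, k=40)          # e.g. k = 2% of 2000 units
```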

Page 11: Novelty generation with deep learning

Generating new types of objects: experimental setup

• Training data: MNIST, 70,000 images of handwritten digits of size 28x28

• We use a sparse convolutional autoencoder trained to:

• Encode: take an image and transform it into a sparse code

• Decode: take the sparse code and reconstruct the image

• The training objective is to minimize the reconstruction error

Slide adapted from Kazakçı et al. 2016
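A minimal sketch of a convolutional autoencoder for 28x28 MNIST images; the architecture is an illustrative stand-in, not the authors' exact model, and the sparsity constraint from the previous slides is omitted for brevity.

```python
# Illustrative convolutional autoencoder for MNIST.
import torch
import torch.nn as nn

encode = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 28x28 -> 14x14
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 14x14 -> 7x7
)
decode = nn.Sequential(
    nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),
    nn.ReLU(),                                              # 7x7 -> 14x14
    nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1),
    nn.Sigmoid(),                                           # 14x14 -> 28x28
)

x = torch.rand(8, 1, 28, 28)                                # a batch of images
loss = nn.functional.mse_loss(decode(encode(x)), x)         # reconstruction objective
```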

Page 12: Novelty generation with deep learning

Generating new types of objects: generating new symbols

• We use an iterative method to build symbols the network has never seen (inspired by Bengio et al. (2013), but we don’t try to avoid spurious samples):

• Start with a random image

• Force the network to construct (i.e. interpret) it by repeatedly applying f(x) = decode(encode(x)), until convergence

Slide adapted from Kazakçı et al. 2016
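A minimal sketch of this iterative procedure: start from a random image and repeatedly apply f(x) = decode(encode(x)) until the image stops changing. The stopping tolerance and iteration cap are illustrative choices.

```python
# Iterative generation: repeatedly reconstruct a random image until it
# settles at a fixed point of the autoencoder.
import torch

def generate(encode, decode, shape=(1, 1, 28, 28), max_iter=100, tol=1e-5):
    x = torch.rand(shape)                   # start with a random image
    for _ in range(max_iter):
        with torch.no_grad():
            x_next = decode(encode(x))      # force the network to "interpret" x
        if (x_next - x).abs().max() < tol:
            break                           # converged to a fixed point
        x = x_next
    return x
```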

Page 13: Novelty generation with deep learning

Generating new types of objects: generating new symbols

• What does the iterative generation procedure do?

• It traces a non-linear path in the input space, defined by the autoencoder (encode + decode) function

• It converges to fixed points defined by the autoencoder

Figure taken from Alain and Bengio (2013)

Page 14: Novelty generation with deep learning

Generating new types of objects: visualization of the structure of generated images

• Colored clusters are original digits (classes from 0 to 9)

• The gray dots are newly generated objects

• New objects form new clusters

• Using a clustering algorithm, we recover coherent sets of new symbols

Slide adapted from Kazakçı et al. 2016
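A minimal sketch of this analysis step, assuming the encoder codes of real and generated images are available as a NumPy array: t-SNE gives a 2D map for the visualization, and a clustering algorithm (k-means here, an illustrative choice; the slide does not name one) recovers candidate sets of new symbols.

```python
# Embed codes in 2D for plotting and cluster them into candidate "new types".
import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

codes = np.random.rand(2000, 64)                       # stand-in for encoder codes
embedding = TSNE(n_components=2).fit_transform(codes)  # 2D map of the structure
labels = KMeans(n_clusters=20, n_init=10).fit_predict(codes)  # coherent symbol sets
```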

Page 15: Novelty generation with deep learning

Generating new types of objects

In Kazakçı et al. 2016:

• We show that symbols of new types can be generated by carefully tuned autoencoders

• We take a first step toward defining a conceptual and experimental framework for novelty generation

• However, we make no attempt to design evaluation metrics

A set of types (clusters) discovered by the model

Page 16: Novelty generation with deep learning

Evaluating novelty

In “Out-of-class novelty generation: an experimental foundation” :

• We design an experimental framework based on hold-out classes

• We review and analyze the most common evaluation techniques from the point of view of measuring “out-of-distribution novelty” and propose new ones

• We run a large-scale experiment to study the capacity of a wide set of generative models for generating novelty

Page 17: Novelty generation with deep learning

Evaluating novelty: experimental framework

• We contrast two main concepts: in-class and out-of-class generation

• In-class generation: can a model re-generate the types already seen in the dataset? (the traditional objective)

• Out-of-class generation: can a model generate an unseen (hold-out) set of types? (a proxy for measuring the capacity of a model to generate novelty)

• Setup: we train models on one set of types (“in”) and look for models that generate a hold-out set of types (“out”); a minimal data-loading sketch follows below
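A minimal data-loading sketch of the hold-out-class setup, assuming MNIST digits as the in-class types and EMNIST letters as the out-of-class hold-out (the slides only say “digits” and “letters”; the choice of EMNIST is an assumption).

```python
# In-class (digits) vs. out-of-class (letters) data split.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
in_class = datasets.MNIST("data", train=True, download=True, transform=to_tensor)
out_class = datasets.EMNIST("data", split="letters", train=False,
                            download=True, transform=to_tensor)
# Generative models are trained on `in_class` only; `out_class` is used
# solely to build the evaluation classifiers, never to train the models.
```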

Page 18: Novelty generation with deep learning

Evaluating novelty: evaluation metrics

• In our experiments:

• We train models on digits

• We look for models that generate letters

[Figure: examples of in-class (digit) and out-of-class (letter) images.]

Page 19: Novelty generation with deep learning

Evaluating novelty: evaluation metrics

• We pre-train:

• a digit classifier (0 to 9)

• a letter classifier (a to z)

• a classifier on a mixture of digits and letters

• Our evaluation metrics report a score for the set of objects generated by a model

Page 20: Novelty generation with deep learning

Evaluating novelty: evaluation metrics

Given a set of images, out-of-class objectness is high if:

• the letter classifier is highly confident that each image is one of the letters (a to z)

• we define in-class objectness similarly, but using the digit classifier
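A minimal sketch of out-of-class objectness as the slide describes it, read here as the average confidence of the letter classifier in its most likely class; the paper’s exact formula may differ.

```python
# Objectness: mean classifier confidence over a set of generated images.
import torch

def objectness(classifier, images):
    with torch.no_grad():
        probs = torch.softmax(classifier(images), dim=1)
    return probs.max(dim=1).values.mean().item()

# Out-of-class objectness uses the letter classifier; in-class objectness
# is the same computation with the digit classifier.
```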

Page 21: Novelty generation with deep learning

Evaluating novelty: evaluation metrics

Given a set of images, out-of-class max and out-of-class count are high if:

• the mixed digits-and-letters classifier is highly confident that each image is one of the letters (a to z)

• we define in-class max and in-class count similarly, but for digits
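A minimal sketch of out-of-class max and out-of-class count under the mixed classifier, assuming its first 10 outputs are the digit classes and the rest are the letter classes; the exact definitions in the paper may differ.

```python
# Two out-of-class scores from the mixed (digits + letters) classifier.
import torch

def out_of_class_max_and_count(mixed_classifier, images, n_digits=10):
    with torch.no_grad():
        probs = torch.softmax(mixed_classifier(images), dim=1)
    letter_probs = probs[:, n_digits:]        # probability mass on letters
    oo_max = letter_probs.max(dim=1).values.mean().item()
    oo_count = (probs.argmax(dim=1) >= n_digits).float().mean().item()
    return oo_max, oo_count
```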

Page 22: Novelty generation with deep learning

Experiments

• We run a large-scale experiment in which we train ~1000 models, varying their hyperparameters

• From each model we generate 1000 images, then evaluate the model using our proposed metrics

• We collect a total of ~1,000,000 generated images

Page 23: Novelty generation with deep learning

Experiments: evaluating the evaluators

• We evaluate the evaluators with human assessment

• We build an annotation tool to check whether the models selected by our evaluation metrics are in fact good

Page 24: Novelty generation with deep learning

Experiments: results

• We found that selecting models for in-class generation makes them memorize the classes they are trained to sample from

• We did succeed in finding models that lead to out-of-class novelty

• Pangram obtained from the above model:

Page 25: Novelty generation with deep learning

Experiments: results

Page 26: Novelty generation with deep learning

Perspectives

• The main focus was setting up the experimental pipeline and analyzing various quality metrics designed to measure out-of-distribution novelty

• The immediate next goal is to analyze the models in a systematic way

Page 27: Novelty generation with deep learning

Thank you for listening!