cs365:artificial intelligence instructor: dr. amitabha ... · parsing natural scene images using...

17
P ARSING N ATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee Students: Shubham Gupta Vedant Mishra

Upload: others

Post on 13-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

PARSING NATURAL SCENE IMAGES USING RECURSIVE

NEURAL NETWORKS

CS365:Artificial Intelligence

Instructor: Dr. Amitabha Mukerjee

Students: Shubham Gupta

Vedant Mishra

Page 2: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

SUMMARY Motivation

Work done Before

Overview of the process

Image Segmentation and Feature

Extraction

Algorithm For Image Parsing

Results

References

Page 3: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

MOTIVATION This approach is general in nature and

not limited to scene images only.

It not only classifies the scene into

discrete units but also helps in

understanding the way these interact to

form a whole scene.

It has outperformed the state-of-the-art

methods for image classification and

segmentation on the Stanford dataset.

Page 4: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

WORK DONE BEFORE

Main work on which our project would be based is “Parsing Natural Scenes and Natural Language with Recursive Neural Networks” by Richard Socher, Andrew Y. Ng, Christopher D. Manning, Cliff Chiung-Yu Lin

“Decomposing a Scene into Geometric and Semantically Consistent Regions “ by Gould, S., Fulton, R., and Koller, D. also uses merging operations for scene classification.

Hinton, G. E. and Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science,313, 2006 used Deep learning to find lower dimensional representations for fixed size input images.

Page 5: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

OVERVIEW OF THE PROCESS

We would be testing the algorithm on some images

of nearby locality.

Extract features

for these

superpixels

Features are

mapped to a new

semantic space

Compute a score

for different

region using RNN

Output the image

showing different

segments

Use these scores

to merge different

segments

Over segment

image into

superpixels

Page 6: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

IMAGE SEGMENTATION AND

FEATURE EXTRACTION 1. Over segment the image using Mean Shift

Algorithm (using [4])

2. Extract Features for each segment (Using [3])

a) Color Histogram

b) Shape Features

c) Area

d) Textures

Page 7: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

ALGORITHM FOR PARSING

IMAGES Input

Set of vector which represent image segment

Adjacency matrix A(i,j) which is 1 if segment i is neighbor of j.

Output

1. The semantic representation if the 2 vectors are merged.

2. Score of how plausible the new node is.

3. Correct Parse Tree

Recursive Neural Network is used for Parsing.

Neural

Network

Score

Input Vectors

Output

Page 8: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

RECURSIVE NEURAL NETWORK

c1

1. p = sigmoid ( W c2 + b ), Using[1]

where sigmoid outputs between 0 and 1 and c1 and c2 are input vectors.

Score = (wT)score p Using [1]

where wT is a parameter vector

Neural Network

Score Parent(p)

c1 c2

Page 9: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

PARSING AN IMAGE

Image from www.socher.org

Page 10: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

RESULTS

Test Image Image after

( Faculty Building) Segmentation

Matrix denoting different

segments of the image

Page 11: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

REFERENCES [1] Main Paper: socher-linCC-NgA-11_parsing-natural-scenes-w-

RNNs

http://nlp.stanford.edu/pubs/SocherLinNgManning_ICML2011.pdf

[2] Video related to Parsing Images: http://techtalks.tv/talks/54422/

[3] Decomposition of scene into geometric regions and semantically

consistent regions

http://robotics.stanford.edu/~koller/Papers/Gould+al:ICCV09.pdf

[4] Comaniciu, D. and Meer, P. Mean shift: a robust approach

toward feature space analysis. IEEE PAMI, 24(5):603–

619, May 2002.

Dataset and source code: [1]nlp.stanford.edu/~socherr/cppFeatures.tar.bz2

[2]http://www.socher.org/index.php/Main/ParsingNaturalScenesAndNaturalLanguageWithRecursiveNeuralNetworks

Page 12: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

THANK

YOU!!

Questions??

Page 13: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

SIGMOID FUNCTION

Map elements to 0 and 1.

S(t) = 1

1 + e-t

Page 14: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

MAX MARGIN FRAMEWORK

This framework is for training the data.

The score of the tree is computed by sum of the

parsing decision scores at each node.

s(xi , yj ) is the score for the correct parse yi

corresponding to the input image xi.

Max margin framework is defined as :

J = ∑ s(xi , yj) – max ( s(xi , y) + ∆(y , yi) )

The loss function ∆ penalizes all incorrect decision.

Page 15: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

LABELLING IN RECURSIVE NEURAL

NETWORK

We can use softmax function for calculating labels in neural network.

• Labelp = softmax ( Wlabel p ) ref [1]

pi = exp(qi)

∑nj=1 exp(qj)

Different label for different segments according to p . For eg

red - Building

blue - Sky

green - Grass

Neural

Network

Input Vectors

Softmax

Layer

BUILDING

Page 16: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

OVERSEGMENTATION

Kernel is a function of feature vector.

Page 17: CS365:Artificial Intelligence Instructor: Dr. Amitabha ... · PARSING NATURAL SCENE IMAGES USING RECURSIVE NEURAL NETWORKS CS365:Artificial Intelligence Instructor: Dr. Amitabha Mukerjee

MAPPING IN THE SEMANTIC SPACE

aj = f( Wsem Fi + bsem )

where ai is the map in semantic space, Fi is the

feature vector and W is the parameter matrix