
Generative learning methods for bags of features

• Model the probability of a bag of features given a class

Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Generative methods

• We will cover two models, both inspired by text document analysis:
  • Naïve Bayes
  • Probabilistic Latent Semantic Analysis

The Naïve Bayes model

Csurka et al. 2004

• Assume that each feature is conditionally independent given the class

fi: ith feature in the image
N: number of features in the image

$$p(f_1, \dots, f_N \mid c) = \prod_{i=1}^{N} p(f_i \mid c)$$

The Naïve Bayes model

Csurka et al. 2004

$$p(f_1, \dots, f_N \mid c) = \prod_{i=1}^{N} p(f_i \mid c) = \prod_{j=1}^{M} p(w_j \mid c)^{n(w_j)}$$

wj: jth visual word in the vocabulary
M: size of the visual vocabulary
n(wj): number of features of type wj in the image

fi: ith feature in the image
N: number of features in the image

The Naïve Bayes model

Csurka et al. 2004

• Assume that each feature is conditionally independent given the class

$$p(w_j \mid c) = \frac{\text{no. of features of type } w_j \text{ in training images of class } c}{\text{total no. of features in training images of class } c}$$

$$p(f_1, \dots, f_N \mid c) = \prod_{i=1}^{N} p(f_i \mid c) = \prod_{j=1}^{M} p(w_j \mid c)^{n(w_j)}$$

The Naïve Bayes model

Csurka et al. 2004

• Assume that each feature is conditionally independent given the class

$$p(w_j \mid c) = \frac{(\text{no. of features of type } w_j \text{ in training images of class } c) + 1}{(\text{total no. of features in training images of class } c) + M}$$

(Laplace smoothing to avoid zero counts)

$$p(f_1, \dots, f_N \mid c) = \prod_{i=1}^{N} p(f_i \mid c) = \prod_{j=1}^{M} p(w_j \mid c)^{n(w_j)}$$
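The smoothed estimate can be sketched in a few lines of NumPy. This is an illustration with hypothetical toy counts, not code from Csurka et al.:

```python
import numpy as np

# counts[c, j] = number of features of visual word type w_j observed
# in training images of class c (hypothetical toy data)
counts = np.array([[12, 0, 3],
                   [ 1, 7, 4]], dtype=float)
M = counts.shape[1]  # vocabulary size

# Laplace smoothing: add 1 to each word count and M to each class
# total, so that p(w_j | c) is never zero for unseen words
p_w_given_c = (counts + 1) / (counts.sum(axis=1, keepdims=True) + M)

print(p_w_given_c)  # each row is a valid distribution over the vocabulary
```

Note that word w_1, never seen in class 0, still gets probability 1/18 rather than 0.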

$$c^* = \arg\max_c \; p(c) \prod_{j=1}^{M} p(w_j \mid c)^{n(w_j)} = \arg\max_c \left[ \log p(c) + \sum_{j=1}^{M} n(w_j) \log p(w_j \mid c) \right]$$

The Naïve Bayes model

Csurka et al. 2004

• Maximum A Posteriori decision:

(you should compute the log of the likelihood instead of the likelihood itself in order to avoid underflow)
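A minimal sketch of this log-domain MAP decision, assuming hypothetical smoothed parameters (variable names are illustrative):

```python
import numpy as np

# Hypothetical smoothed parameters: 2 classes, 3-word vocabulary
log_prior = np.log(np.array([0.5, 0.5]))           # log p(c)
p_w_given_c = np.array([[13/18, 1/18, 4/18],       # p(w_j | c), one row per class
                        [ 2/15, 8/15, 5/15]])
n_w = np.array([5, 0, 2])                          # n(w_j): word counts in a test image

# MAP decision computed in the log domain to avoid underflow:
# c* = argmax_c [ log p(c) + sum_j n(w_j) log p(w_j | c) ]
log_posterior = log_prior + n_w @ np.log(p_w_given_c).T
c_star = int(np.argmax(log_posterior))
print(c_star)
```

Multiplying many probabilities below 1 would underflow to 0 for large N; summing their logs is numerically safe and preserves the argmax.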

The Naïve Bayes model

Csurka et al. 2004

• “Graphical model”:

[Plate diagram: class node c generates word node w; the w node sits in a plate repeated N times]

Probabilistic Latent Semantic Analysis

T. Hofmann, Probabilistic Latent Semantic Analysis, UAI 1999

[Figure: an image is modeled as a mixture of “visual topics”: Image = p1 · (zebra) + p2 · (grass) + p3 · (tree)]

Probabilistic Latent Semantic Analysis

• Unsupervised technique
• Two-level generative model: a document is a mixture of topics, and each topic has its own characteristic word distribution

[Diagram: document d → topic z → word w, with arrows labeled P(z|d) and P(w|z)]

T. Hofmann, Probabilistic Latent Semantic Analysis, UAI 1999

Probabilistic Latent Semantic Analysis

• Unsupervised technique
• Two-level generative model: a document is a mixture of topics, and each topic has its own characteristic word distribution

[Diagram: document d → topic z → word w]

T. Hofmann, Probabilistic Latent Semantic Analysis, UAI 1999

$$p(w_i \mid d_j) = \sum_{k=1}^{K} p(w_i \mid z_k)\, p(z_k \mid d_j)$$

The pLSA model

$$p(w_i \mid d_j) = \sum_{k=1}^{K} p(w_i \mid z_k)\, p(z_k \mid d_j)$$

p(wi | dj): probability of word i in document j (known)
p(wi | zk): probability of word i given topic k (unknown)
p(zk | dj): probability of topic k given document j (unknown)

The pLSA model

[Matrix view: the observed codeword distributions p(wi|dj) (words × documents, M×N) decompose as the product of the codeword distributions per topic (class) p(wi|zk) (words × topics, M×K) and the class distributions per image p(zk|dj) (topics × documents, K×N)]

$$p(w_i \mid d_j) = \sum_{k=1}^{K} p(w_i \mid z_k)\, p(z_k \mid d_j)$$

Maximize the likelihood of the data:

$$L = \prod_{i=1}^{M} \prod_{j=1}^{N} p(w_i \mid d_j)^{n(w_i, d_j)}$$

n(wi, dj) … observed count of word i in document j
M … number of codewords
N … number of images

Slide credit: Josef Sivic

Learning pLSA parameters
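The parameters p(w|z) and p(z|d) are fit by Expectation-Maximization. A minimal NumPy sketch on hypothetical random counts (variable names are illustrative, not Hofmann's code):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 6, 4, 2                                   # codewords, images, topics
n = rng.integers(1, 10, size=(M, N)).astype(float)  # hypothetical n(w_i, d_j) counts

# Random initialization of p(w|z) (M×K) and p(z|d) (K×N), columns normalized
p_w_z = rng.random((M, K)); p_w_z /= p_w_z.sum(axis=0, keepdims=True)
p_z_d = rng.random((K, N)); p_z_d /= p_z_d.sum(axis=0, keepdims=True)

for _ in range(100):
    # E-step: posterior p(z_k | w_i, d_j) ∝ p(w_i|z_k) p(z_k|d_j)
    joint = p_w_z[:, :, None] * p_z_d[None, :, :]    # shape (M, K, N)
    post = joint / joint.sum(axis=1, keepdims=True)  # normalize over topics k
    # M-step: re-estimate both distributions from expected counts
    expected = n[:, None, :] * post                  # shape (M, K, N)
    p_w_z = expected.sum(axis=2)
    p_w_z /= p_w_z.sum(axis=0, keepdims=True)
    p_z_d = expected.sum(axis=0)
    p_z_d /= p_z_d.sum(axis=0, keepdims=True)

# Matrix-decomposition view: p(w|d) = p(w|z) @ p(z|d), i.e. (M×N) = (M×K)(K×N)
p_w_d = p_w_z @ p_z_d
print(p_w_d.shape)
```

Each EM iteration cannot decrease the likelihood; the final product recovers the (M×N) matrix of per-document word distributions from the two smaller factors.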

Inference

$$z^* = \arg\max_z \, p(z \mid d)$$

• Finding the most likely topic (class) for an image:

Inference

$$z^* = \arg\max_z \, p(z \mid d)$$

• Finding the most likely topic (class) for an image:

$$z^* = \arg\max_z \, p(z \mid w, d) = \arg\max_z \frac{p(w \mid z)\, p(z \mid d)}{\sum_{z'} p(w \mid z')\, p(z' \mid d)}$$

• Finding the most likely topic (class) for a visual word in a given image:
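Both inference rules can be sketched with hypothetical learned parameters (a toy illustration, not the original implementation):

```python
import numpy as np

# Hypothetical learned parameters: 2 topics, 3-word vocabulary, one image d
p_w_z = np.array([[0.7, 0.1],
                  [0.2, 0.3],
                  [0.1, 0.6]])   # p(w|z): columns are topic word distributions
p_z_d = np.array([0.8, 0.2])    # p(z|d): topic mixture for this image

# Most likely topic (class) for the whole image: z* = argmax_z p(z|d)
z_star = int(np.argmax(p_z_d))

# Topic posterior for one observed word w in this image (Bayes' rule):
# p(z|w,d) = p(w|z) p(z|d) / sum_z' p(w|z') p(z'|d)
w = 2                           # index of the observed visual word
joint = p_w_z[w] * p_z_d
p_z_given_wd = joint / joint.sum()
print(z_star, p_z_given_wd)
```

Note the two answers can disagree: the image as a whole favors topic 0, yet word 2, being much more probable under topic 1, is assigned to topic 1 with posterior 0.6.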

Topic discovery in images

J. Sivic, B. Russell, A. Efros, A. Zisserman, B. Freeman, Discovering Objects and their Location in Images, ICCV 2005

Application of pLSA: Action recognition

Juan Carlos Niebles, Hongcheng Wang and Li Fei-Fei, Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words, IJCV 2008.

Space-time interest points

Application of pLSA: Action recognition

Juan Carlos Niebles, Hongcheng Wang and Li Fei-Fei, Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words, IJCV 2008.

pLSA model

– wi = spatial-temporal word

– dj = video

– n(wi, dj) = co-occurrence table (# of occurrences of word wi in video dj)

– z = topic, corresponding to an action

$$p(w_i \mid d_j) = \sum_{k=1}^{K} p(w_i \mid z_k)\, p(z_k \mid d_j)$$

p(wi | dj): probability of word i in video j (known)
p(wi | zk): probability of word i given topic k (unknown)
p(zk | dj): probability of topic k given video j (unknown)
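Building the co-occurrence table n(wi, dj) from quantized spatio-temporal words can be sketched as follows (video names and word indices are hypothetical):

```python
import numpy as np

# Hypothetical quantized spatio-temporal word indices per video
videos = {
    "clip_01": [0, 2, 2, 1],
    "clip_02": [1, 1, 3],
}
M = 4  # vocabulary size

# Co-occurrence table n(w_i, d_j): rows = visual words, columns = videos
n = np.zeros((M, len(videos)), dtype=int)
for j, words in enumerate(videos.values()):
    for w in words:
        n[w, j] += 1

print(n)
```

This M×N count matrix is the only input the pLSA EM procedure needs; topics learned from it then correspond to action categories.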

Action recognition example

Multiple Actions

Summary: Generative models

• Naïve Bayes
  • Unigram model from document analysis
  • Assumes conditional independence of words given the class
  • Parameter estimation: frequency counting

• Probabilistic Latent Semantic Analysis
  • Unsupervised technique
  • Each document is a mixture of topics (each image is a mixture of classes)
  • Can be thought of as matrix decomposition
  • Parameter estimation: Expectation-Maximization
