A structured learning framework for content-based image indexing and visual Query (Joo-Hwee, Jesse S. Jin) Presentation By: Salman Ahmad (270279)

Post on 22-Dec-2015

A structured learning framework for content-based image indexing and visual Query

(Joo-Hwee, Jesse S. Jin)

Presentation By: Salman Ahmad (270279)

Introduction

• Motivation
– Content-based image retrieval from non-specific images in a broad domain.

Literature Review

• Semantic labeling approach by Town & Sinclair: classification into 11 visual categories using an artificial neural network (ANN).

• Monotonic tree approach for classifying an image into semantic regions (8 in total).

• Associating images with words, which does not scale to images with diverse content.

Image Retrieval Cycle

[Diagram. Storage: user images → feature extraction → feature & image storage. Retrieval: user → query term comparison with features → result.]

Semantic Gap

• Semantic Extraction
– Requires object recognition & scene understanding
– Monotonic tree

• Semantic Interpretation
– Pre-query (manual annotation), query, post-query (relevance feedback)

Semantic Gap

[Diagram: image → low-level feature extraction → object recognition → user search term & requirement]

Structured learning for Image Indexing

• Based on semantic support regions (SSRs): salient image patches that exhibit semantic meaning.

• SSRs are learned a priori and detected during image indexing.

• No region segmentation step is required.

• Images are indexed onto the classification space spanned by the semantic labels.

Semantic Support Region (SSR)

• Introduced to address the issue of high content diversity.

• Modular, view-based object detectors.

• Generate a spatial semantic signature.

• Similarity-based and fuzzy-logic-based query processing.

• Not restricted to the main area of attention in the image.

Semantic Support Region (SSR)

[Figure: example SSR patches, labeled Face, Figure, Crowd, Skin, Flower, Green, Far, Branch, Cloudy, Clear, Floor, Blue, Sand, Rocky, Pool, Grass, Far, City, Wall, Old, Wooden, Pond, China, River, Fabric, Light]

Semantic Support Regions (SSR)

• SSRs are segmentation-free image regions.

• They have semantic meanings.

• Detected from tessellated image blocks.

• Reconciled across multiple resolutions.

• Aggregated spatially.

SSR learning

• Use of Support Vector Machines (SVMs)

• Features employed
– Color (YIQ): 6 dimensions
– Texture (Gabor coefficients): 60 dimensions

• 26 classes of SSR
– 8 super classes: people (4), sky (3), ground (3), water (3), foliage (3), mountain (2), building (3), interior (5)

• Kernel: polynomial with degree 2 and a constant.

• Total data for training & testing: 554 image regions from 138 images.

• Training data: 375 image regions from 105 images.

• Test data: 179 image regions.
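
As a rough illustration of the kernel named above, a degree-2 polynomial kernel with a constant term has the form K(x, y) = (x · y + c)^2. The sketch below is only an example: the function name `poly_kernel`, the default c = 1.0, and the toy vectors are hypothetical, since the slides say "a constant" without giving its value.

```python
def poly_kernel(x, y, degree=2, c=1.0):
    """Polynomial kernel (x . y + c) ** degree.

    degree=2 matches the slide; c=1.0 is an assumed placeholder,
    because the slides do not state the constant.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + c) ** degree


# Toy 6-dimensional vectors standing in for YIQ color features (made-up values).
u = [1.0, 0.0, 2.0, 0.0, 1.0, 0.0]
v = [0.5, 1.0, 0.5, 0.0, 2.0, 1.0]
k_uv = poly_kernel(u, v)  # dot product is 3.5, so this is (3.5 + 1.0) ** 2
```

The kernel is symmetric in its two arguments, which is one of the properties an SVM kernel needs.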

SSR Detection

SSR Detection

• Feature vectors zc and zt (for color & texture).

• 3 color maps and 30 texture maps from Gabor coefficients.

• Windows of different scales are used for scale invariance.

• Each pixel consolidates the SSR classification vector Ti(z).
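
The multi-scale scanning described above can be sketched as enumerating square windows of several sizes over the image. This is a minimal sketch under stated assumptions: the helper name `scan_windows`, the 10-pixel stride, and the intermediate 40 x 40 size are hypothetical; the 20 x 20 to 60 x 60 range and the 240 x 360 image size are taken from other slides in this deck.

```python
def scan_windows(height, width, sizes=(20, 30, 40, 50, 60), stride=10):
    """Yield (size, top, left) for square scan windows of several sizes.

    The stride and the exact list of sizes are assumptions of this sketch.
    Windows that would cross the image border are skipped.
    """
    for size in sizes:
        for top in range(0, height - size + 1, stride):
            for left in range(0, width - size + 1, stride):
                yield size, top, left


# A 240 x 360 image, the resolution used in the experiments.
windows = list(scan_windows(240, 360))
count_60 = sum(1 for size, _, _ in windows if size == 60)
```

Each window would then be described by its color and texture feature vectors and passed to the SSR classifiers.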

Multiscale Reconciliation

• Objects are detected in different regions of the image.

• Fuses multiple SSRs detected at different image scales.

• Compares two detection maps at a time (from 60 x 60 & 50 x 50 down to 30 x 30 & 20 x 20).

• The smallest scan windows consolidate the result.
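
The pairwise comparison above can be sketched as a fold over detection maps ordered from the largest window to the smallest. The merge rule used here (keep, per pixel, the classification vector with the larger peak score) is an assumption made for illustration, not a rule stated on the slides; `reconcile` and `reconcile_all` are hypothetical names.

```python
def reconcile(map_a, map_b):
    """Merge two detection maps pixel by pixel.

    Each map is a 2-D list whose cells hold a classification vector
    (one score per SSR class). For every pixel this sketch keeps the
    vector whose strongest score is larger; that rule is an assumption.
    """
    merged = []
    for row_a, row_b in zip(map_a, map_b):
        merged.append([
            vec_a if max(vec_a) >= max(vec_b) else vec_b
            for vec_a, vec_b in zip(row_a, row_b)
        ])
    return merged


def reconcile_all(maps_large_to_small):
    """Apply pairwise reconciliation across maps ordered from large to small windows."""
    result = maps_large_to_small[0]
    for next_map in maps_large_to_small[1:]:
        result = reconcile(result, next_map)
    return result


# Toy 1 x 2 maps with two SSR classes (made-up scores).
coarse = [[[0.9, 0.1], [0.4, 0.6]]]
fine = [[[0.5, 0.5], [0.2, 0.8]]]
merged = reconcile_all([coarse, fine])
```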

Spatial Aggregation

• Summarizes the reconciled detection map over larger spatial regions.

• The spatial aggregation map (SAM) allows variable emphasis (weights).

• SAMs are invariant to image rotation & translation.

• SAMs are affected only slightly by changes in viewing angle, changes of scale, and occlusion.
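
A minimal sketch of spatial aggregation, assuming the reconciled detection map is a 2-D grid of per-pixel score vectors and that each SAM cell is a plain average of the vectors it covers. Uniform averaging is an assumption (the slides mention weights without giving them), the function name `aggregate` is hypothetical, and the 3 x 3 default follows the tessellation mentioned later in the deck.

```python
def aggregate(detection_map, grid_rows=3, grid_cols=3):
    """Average per-pixel classification vectors inside each grid cell.

    detection_map is a 2-D list of equal-length score vectors and must be
    at least grid_rows x grid_cols in size. Uniform averaging is an
    assumption of this sketch. Returns a grid_rows x grid_cols list of vectors.
    """
    height, width = len(detection_map), len(detection_map[0])
    n_classes = len(detection_map[0][0])
    sam = []
    for gr in range(grid_rows):
        r0, r1 = gr * height // grid_rows, (gr + 1) * height // grid_rows
        row = []
        for gc in range(grid_cols):
            c0, c1 = gc * width // grid_cols, (gc + 1) * width // grid_cols
            totals = [0.0] * n_classes
            for r in range(r0, r1):
                for c in range(c0, c1):
                    for k, score in enumerate(detection_map[r][c]):
                        totals[k] += score
            area = (r1 - r0) * (c1 - c0)
            row.append([t / area for t in totals])
        sam.append(row)
    return sam


# Toy 6 x 6 map with two classes: class 0 in the top-left 2 x 2 corner, class 1 elsewhere.
toy_map = [
    [[1.0, 0.0] if r < 2 and c < 2 else [0.0, 1.0] for c in range(6)]
    for r in range(6)
]
sam = aggregate(toy_map)
```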

Spatial Aggregation Map

Scalability

• Modular nature.

• Independent training of binary detectors.

• Parallel computation of feature maps.

• Simultaneous detection of multiple SSRs.

• Concurrent spatial aggregation by different nodes in the SAM.

• Retraining of the SVM with the addition of new SSRs.

Query Methods

• Low-level features
– QBE (Query By Example)
– QBC (Query By Canvas)

• Semantic information
– QBK (Query By Keywords)
– QBS (Query By Sketches)
– QBSI (Query By Spatial Icons)

Query Formulation & Parsing

• QBME (Query By Multiple Examples)

• Similarity between two images is computed from the similarity between their tessellated blocks.

• Larger blocks match similar semantics with different spatial arrangements.

• Smaller blocks give spatial specificity.

• City block distance provides the best performance.
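
The block-wise comparison above can be sketched with the city block (L1) distance. Several details here are assumptions rather than facts from the slides: block vectors are taken to be non-negative and to sum to 1 (so the distance is at most 2 and `1 - distance / 2` lies in [0, 1]), block similarities are averaged, and a candidate is scored against multiple examples by its best match. All function names are hypothetical.

```python
def city_block(u, v):
    """L1 (city block) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def image_similarity(blocks_a, blocks_b):
    """Mean block similarity between two images with the same tessellation.

    Each image is a flat list of block vectors. The 1 - distance / 2 form
    assumes every block vector is non-negative and sums to 1.
    """
    sims = [1.0 - city_block(a, b) / 2.0 for a, b in zip(blocks_a, blocks_b)]
    return sum(sims) / len(sims)


def qbme_score(query_images, candidate):
    """Score a candidate against several examples by its best match (assumed rule)."""
    return max(image_similarity(q, candidate) for q in query_images)


# Toy images with two blocks and two SSR classes each (made-up values).
q1 = [[1.0, 0.0], [0.5, 0.5]]
q2 = [[0.0, 1.0], [0.0, 1.0]]
cand = [[1.0, 0.0], [0.0, 1.0]]
score = qbme_score([q1, q2], cand)
```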

Query Formulation & Parsing

• QBSI (Query By Spatial Icons)
– Spatial arrangement of visual semantics
– A visual query term (VQT) Q specifies a region R for SSR i.
– These VQTs can be chained.
– Two-level is-a hierarchy of SSRs
– Use of max for abstract visual semantics.
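
One way to picture a visual query term and the use of max for abstract semantics is sketched below. Everything specific in it is an assumption: the `HIERARCHY` membership (which SSRs fall under "water"), the index layout (grid cell to per-SSR score), the averaging over the cells of region R, and the function name `eval_vqt`.

```python
# Hypothetical two-level is-a hierarchy: abstract class -> concrete SSR labels.
HIERARCHY = {
    "water": ["Pool", "Pond", "River"],
}


def eval_vqt(index, label, region):
    """Score one visual query term against an indexed image.

    index maps (row, col) grid cells to {SSR label: score}.
    region is a list of grid cells. An abstract label (a key of HIERARCHY)
    is scored as the max over its child SSRs; averaging over the cells of
    the region is an assumption of this sketch.
    """
    children = HIERARCHY.get(label, [label])
    cell_scores = [
        max(index[cell].get(child, 0.0) for child in children)
        for cell in region
    ]
    return sum(cell_scores) / len(cell_scores)


# Toy index for the bottom row of a 3 x 3 grid (made-up scores).
index = {
    (2, 0): {"Pool": 0.25, "River": 0.75},
    (2, 1): {"Pond": 0.25},
}
water_bottom = eval_vqt(index, "water", [(2, 0), (2, 1)])
river_corner = eval_vqt(index, "River", [(2, 0)])
```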

Query Formulation & Parsing

• VQTs can be combined in disjunctive normal form (with or without negation).

• Fuzzy operations are used to handle the uncertainty in values.

• The QBSI vocabulary is limited by the semantics.

• A graphical interface is provided for VQTs.

• Images are indexed with a 3 x 3 spatial tessellation and 26 SSRs.
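
A query in disjunctive normal form over fuzzy term values can be evaluated as sketched below. The operators used (min for AND, max for OR, 1 - x for negation) are the standard fuzzy-logic choices and are assumed here; the slides do not spell them out. The term names and scores are made up, and `eval_dnf` is a hypothetical helper.

```python
def eval_dnf(query, term_scores):
    """Evaluate a query in disjunctive normal form with fuzzy operators.

    query is a list of conjunctions; each conjunction is a list of
    (term_name, negated) pairs. term_scores maps term names to values
    in [0, 1]. AND is min, OR is max, NOT is 1 - x (assumed operators).
    """
    def literal(name, negated):
        value = term_scores[name]
        return 1.0 - value if negated else value

    return max(
        min(literal(name, negated) for name, negated in conjunction)
        for conjunction in query
    )


# (sky_top AND NOT crowd_center) OR water_bottom, with made-up term scores.
scores = {"sky_top": 0.75, "crowd_center": 0.5, "water_bottom": 0.25}
query = [[("sky_top", False), ("crowd_center", True)], [("water_bottom", False)]]
result = eval_dnf(query, scores)
```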

Experimental Results

• Tested on consumer images
» More challenging & complex
» Diverse content
» Faded, overexposed, blurred, dark
» Different focuses, distances and occlusion

• 2400 heterogeneous photos of a single family taken over a span of 5 years

• Indoor and outdoor settings

• Resolution of 256 x 384 converted to 240 x 360

• No pre-selection of images

QBME Experiment

• 24 semantic queries for 2400 images

• Truth values based on the opinion of 3 subjects

• Comparison with a feature-based approach (CTO)
» Best performing parameters selected

QBSI Experiment

• 15 QBSI queries for 2400 photos

Query examples for QBSI

QBSI Experiment Results

Precision on top retrieved images for QBSI experiment
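
For reference, precision on the top retrieved images is commonly computed as the fraction of the first k results that are relevant. The sketch below uses that standard definition; the function name `precision_at_k` and the toy identifiers are hypothetical, and no result from the experiment is reproduced here.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved images that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_ids[:k]
    hits = sum(1 for image_id in top if image_id in relevant_ids)
    return hits / k


# Toy ranking and ground truth (made-up identifiers).
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "x"}
p_at_2 = precision_at_k(ranked, relevant, 2)
p_at_4 = precision_at_k(ranked, relevant, 4)
```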

Advantages of QBSI

• Explicit specification of visual semantics and their combinations.

• Better and more accurate expression than sketches and visual icons.

Conclusion & Future Work

• SSRs allow image indexing based on local semantics without region segmentation.

• A unique and powerful query language.

• Extensible to other domains, such as medical images.

Questions