Computational Intelligence in Media Indexing and Retrieval
TRANSCRIPT
Computational Intelligence in Media Indexing and Retrieval
Dr. Kim-Hui Yap
School of Electrical & Electronic Engineering Nanyang Technological University
April 2007
Goals of this Tutorial

- Introduce media indexing and retrieval: motivation, background, applications
- Discuss some challenges faced by media (image) retrieval
- Discuss how Computational Intelligence can be used to address these issues
Media

- This tutorial will use image indexing and retrieval to demonstrate the relevant concepts and techniques
- Similar ideas can be extended to other media retrieval systems
- Media types: graphics, image, video, animation, audio, speech
Media Retrieval Methodologies

- Text-based retrieval: keyword-based; annotated by expert annotators
- Content-based retrieval: centered on audio-visual content analysis; query-by-example
- Metadata-based retrieval: Google/Yahoo search engines; Web mining
- Peer tagging: annotation by voluntary peer users in a distributed manner, e.g. Flickr, YouTube
Motivations

Motivations for the development of efficient media retrieval systems:
- Explosion in the volume of media data over the Internet and wireless networks
- Increasing popularity of imaging devices such as digital cameras, prevalence of low-cost high-capacity storage devices, and proliferation of image data over communications networks
- Emergence of a new consumerism where media technologies meet consumers' needs
Applications

- Education (e.g. mobile learning)
- Military (e.g. surveillance applications)
- Healthcare (e.g. biomedicine)
- Information (e.g. media search engines)
- Social (e.g. MySpace, Facebook)
Tutorial Outline

- Introduction
  - Image indexing and retrieval basics
  - Issues and challenges
- Computational Intelligence in Image Retrieval
  - Soft relevance framework for fuzzy perception
  - Pseudo-labeling and machine learning
  - Region-based image retrieval using neural network
  - Peer tagging and knowledge propagation
- Conclusion
GUI

"Find the image… you must. See how media retrieval can be done using computational intelligence, without the Force."
Introduction
Text-based Image Retrieval

Traditional text-based image retrieval search engines:
- Manual annotation of images by a few experts
- Keywords are used to denote images

Issues faced by traditional text-based image retrieval:
- Exhausting and time-consuming
- Highly individual and subjective
- "An image is worth a thousand words": some images are hard to describe using keywords
“Lake, house, tree, autumn, scenery…”
An image with complex textures that is hard to describe using keywords
“An image is worth a thousand words”
Content-based Image Retrieval (CBIR)
[Block diagram: a query passes through image content analysis to produce a feature vector, which is compared against the feature database (built from the image database) in a decision-making stage, and the results are visualized]

Advantages over text-based image retrieval:
- Alleviates intensive human labor
- Avoids subjectivity
- Offers an alternative way to pose a query, such as query-by-example (QBE)
Existing CBIR Systems

- QBIC (M. Flickner et al. 1995)
- MARS (Y. Rui et al. 1998)
- Virage (G. Amarnath et al. 1997)
- Photobook (A. Pentland et al. 1996)
- VisualSEEk (J. R. Smith et al. 1996)
- PicToSeek (T. Gevers et al. 1996)
- PicHunter (I. J. Cox et al. 2000)
Peer Tagging-based Image Retrieval

- Distributed users (particularly Internet users) provide tags (keywords) to shared images
- Low cost, simple, easy, flexible and effort-sharing
- Examples: Flickr, YouTube
Issues and Challenges
Challenging Problems in Image Retrieval

- Semantic gap
- Small sample problem
- Image region perceptual importance
- Peer tagging and knowledge propagation
Semantic Gap

- The semantic gap lies between low-level visual features (color, texture, shape) and high-level human perception
- Uncertainty in the correspondence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation
[Illustration: for an image of a rose, the CBIR system extracts low-level features (color: red; texture: ruffled; shape: round), while users think in high-level concepts (flower, rose, plant); the semantic gap lies between the two]
Visual Similarity vs Semantic Similarity
Can we address this perceptual ambiguity?
Small Sample Problem

- Learning from a small number of training samples is a challenging problem
- Image labeling is a time-consuming task, and users are often unwilling to label too many images
- Can Computational Intelligence help to resolve this challenge?
Image Region Perceptual Importance

- Global features sometimes fail to match users' object-level perceptions
- How do we determine the importance of different image regions and compare image similarity based on these local regions?
Tagging and Knowledge Propagation

- Tagging can be cumbersome in certain cases, e.g. mobile media
- Users sometimes will not annotate all the images; many existing image databases are not fully annotated
- How do we utilize the correlation between visual content and annotated text to propagate keywords from a small set of samples to the rest of the database?
What is Computational Intelligence (CI)?

A field of study that attempts to simulate human intelligence using computational algorithms. Computational Intelligence techniques include:
- Neural networks
- Fuzzy logic
- Evolutionary computation
- Support vector machines
Why Use Computational Intelligence in Image Retrieval?

Computational intelligence is used in image retrieval because of its capability for systematic signal identification, intelligent information integration, and robust optimization. We will demonstrate how these techniques can be employed to address some of the challenges faced by image retrieval systems.
Structured Levels of Image Content Analysis

This tutorial traces the development of image retrieval systems from the low level (color, texture, shape) through the medium level (regions of attributes) to the high level (keywords, tags):
- Media Layer: images
- Feature Layer: color, texture, shape
- Object Layer: regions of attributes
- Concept Layer: keywords, tags
Soft Relevance Framework for Fuzzy Perception
Interactive CBIR Systems

[Block diagram: the CBIR system shown earlier (query, image content analysis, feature vector, decision making against the feature and image databases, result visualization), extended with a relevance feedback loop from the user back to the decision-making stage]
Relevance Feedback

Why relevance feedback?
- Narrow the semantic gap
- Reduce user perceptual subjectivity

[Feedback loop: the initial retrieval result is displayed, the user gives feedback, the system learns, and an improved result is displayed for the next round of feedback]
Fuzzy User Perception
Query
Retrieval result
Relevant or irrelevant?
Previous Approaches

Binary labeling (Y. Rui et al. 1997; P. Muneesawang et al. 2002):
- Binary feedback: relevant or irrelevant
- Advantage: simple, easy to implement
- Disadvantage: a hard decision, crisp logic that ignores the degree of relevance

Multi-level labeling (Y. Rui et al. 1998; X. S. Zhou et al. 2001):
- Feedback with degrees of relevance: highly relevant, relevant, no opinion, irrelevant, highly irrelevant
- Advantage: more information due to the detailed description
- Disadvantage: cumbersome and tedious
Hierarchical Structure of User Information Priorities
Computational Intelligence in Soft Relevance Framework

Fuzzy interpretation:
- Integrate the potential imprecision of user perception into relevance feedback

Fuzzy labeling:
- Relevant, irrelevant and fuzzy labels
- An a posteriori probability estimator evaluates the relevance of fuzzy images

Machine learning:
- A progressive fuzzy radial basis function network (PFRBFN) is developed to learn the user's information needs
Schematic Overview

[System diagram. Offline processing: database creation and feature extraction (color, texture). Online querying: query selection, feature extraction, similarity comparison (Euclidean distance), display and relevance feedback, and machine learning (PFRBFN) inside a fuzzy relevance feedback loop]
Flowchart

1. Retrieve initial results based on a k-nearest neighbor (k-NN) search
2. User feedback of relevant, irrelevant, and fuzzy images
3. Two-stage clustering to determine the cluster centers for the relevant, irrelevant, and fuzzy subnets
4. Estimation of the soft relevance membership function
5. Construction and training of the PFRBFN
6. Retrieve new images from the database based on the trained PFRBFN
7. If the termination criteria are satisfied, end; otherwise return to step 2
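The initial retrieval in step 1 is a plain k-NN search over the feature database; it can be sketched as follows (the database and query vectors are synthetic placeholders):

```python
import numpy as np

def knn_retrieve(query, database, k=25):
    """Indices of the k database images nearest to the query under the
    Euclidean distance used for the initial search."""
    dists = np.linalg.norm(database - query, axis=1)
    return np.argsort(dists)[:k]

rng = np.random.default_rng(0)
db = rng.random((100, 170))              # 100 images, 170-D vectors as in this tutorial
q = db[42] + 0.01 * rng.random(170)      # a query very close to image 42
top = knn_retrieve(q, db, k=5)
```

The top-ranked images are then shown to the user, whose relevant/irrelevant/fuzzy labels drive the rest of the loop.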
Feature Extraction

Concatenate the features to form 170-dimensional feature vectors:

- Color histogram (32): the HSV space is chosen; each H, S, V component is uniformly quantized into 8, 2 and 2 bins respectively
- Color auto-correlogram (64): the chessboard distance is chosen as the distance measure; the image is quantized into 4x4x4 = 64 colors in the RGB space [11]
- Color moments (6): the first two moments (mean and standard deviation) of the R, G, B color channels
- Gabor wavelet (48): Gabor wavelet filters spanning four scales (0.05, 0.1, 0.2 and 0.4) and six orientations (θ_0 = 0, θ_{n+1} = θ_n + π/6) are applied to the image; the mean and standard deviation of the Gabor wavelet coefficients form the feature vector
- Wavelet moments (20): the wavelet transform is applied to the image with a 3-level decomposition; the mean and standard deviation of the transform coefficients form the feature vector
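As an illustration of how such features are assembled, here is a minimal NumPy sketch of the color-moment and joint-histogram components. It bins raw RGB values rather than performing the HSV conversion described above, and only borrows the 8x2x2 bin counts from the table, so the result is a 38-D toy vector, not the full 170-D one:

```python
import numpy as np

def color_moments(img):
    """First two moments (mean, std) of each channel -> 6-D vector, matching
    the 'color moments' row above (applied to RGB here)."""
    px = img.reshape(-1, 3).astype(float)
    return np.concatenate([px.mean(axis=0), px.std(axis=0)])

def color_histogram(img, bins=(8, 2, 2)):
    """Joint 8x2x2 = 32-bin histogram; the tutorial quantizes H, S, V this
    way, while this sketch bins raw RGB channels for simplicity."""
    px = img.reshape(-1, 3).astype(float) / 255.0
    hist, _ = np.histogramdd(px, bins=bins, range=[(0.0, 1.0)] * 3)
    return hist.ravel() / hist.sum()          # normalize to a distribution

rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(32, 32, 3))
feat = np.concatenate([color_histogram(image), color_moments(image)])  # 38-D toy vector
```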
Main Features of PFRBFN

The PFRBFN is developed with a few main considerations:
- Integrate the unique batch process of user feedback
- Reduce the computational time of the training process
- Integrate the potential fuzzy feedback from the users

A two-stage clustering algorithm simplifies the network by reducing the number of hidden neurons. An efficient gradient-descent-based learning strategy estimates the underlying network parameters by minimizing a cost function.
Schematic Diagram of PFRBFN

[Network diagram: the input vector x = [x_1, x_2, ..., x_R]^T feeds three subnets, relevant (weights w^t_{P_1}, ..., w^t_{P_{I_P}}), irrelevant (w^t_{N_1}, ..., w^t_{N_{I_N}}) and fuzzy (w^t_{F_1}, ..., w^t_{F_{I_F}}), whose combined output U_t(x) is accumulated progressively at the output layer:]

F_t(x) = F_{t-1}(x) + U_t(x), t = 1, 2, ..., T; F_0(x) = 0
PFRBFN Training (I)

Two-stage clustering for PFRBFN subnet creation:
- Subtractive clustering
- Fuzzy C-means (FCM)

A posteriori estimator for the soft relevance membership of a fuzzy image x_j, averaged over M mixture components:

s(x_j) = (1/M) Σ_{m=1}^{M} p_m(x_j | ω_r) P(ω_r) / [ p_m(x_j | ω_r) P(ω_r) + p_m(x_j | ω_i) P(ω_i) ]

where ω_r and ω_i denote the relevant and irrelevant classes, and each class-conditional density is modeled as a Gaussian:

p_m(x_j | ω_q) = (2π)^{-d/2} |Σ_m^q|^{-1/2} exp[ -(1/2) (x_j - μ_m^q)^T (Σ_m^q)^{-1} (x_j - μ_m^q) ]
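The FCM stage of the two-stage clustering can be sketched in NumPy as below. This is the textbook alternating update (memberships, then centers); in the tutorial's pipeline the centers would be seeded by subtractive clustering rather than sampled at random:

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=50, seed=0):
    """Alternate the FCM membership and center updates; centers are
    initialized by random sampling in this sketch."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), c, replace=False)]
    u = None
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-12
        u = 1.0 / d ** (2.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)             # memberships sum to 1 per sample
        w = u ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]  # fuzzy-weighted center update
    return centers, u

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 0.1, (30, 2)), rng.normal(3.0, 0.1, (30, 2))])
centers, u = fuzzy_c_means(X, c=2)
```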
PFRBFN Training (II)

Kernel function:

f(x, v^t_{αi}, σ^t_{αi}) = exp( -(x - v^t_{αi})^T Λ (x - v^t_{αi}) / (2 (σ^t_{αi})^2) ), t = 1, 2, ..., T; i = 1, 2, ..., I_α

Desired output:

Y_t(x_j) = 0 if x_j ∈ N; 1 if x_j ∈ P; s(x_j) if x_j ∈ F

PFRBFN output:

F_t(x) = F_{t-1}(x) + Σ_{α ∈ {P, N, F}} Σ_{i=1}^{I_α} w^t_{αi} f(x, v^t_{αi}, σ^t_{αi}), t = 1, 2, ..., T; F_0(x) = 0

Error function:

E_t = (1/2) Σ_{j=1}^{N_T} e_{jt}^2 = (1/2) Σ_{j=1}^{N_T} ( Y_t(x_j) - F_t(x_j) )^2
PFRBFN Training (III)

Gradient descent learning:

θ̂ = arg min_{θ ∈ Θ} E_t = arg min_{θ ∈ Θ} (1/2) Σ_{j=1}^{N_T} ( Y_t(x_j) - F_t(x_j) )^2

(1) Weight estimation at the k-th learning iteration:

∂E_t(k)/∂w^t_{αi}(k) = -Σ_{j=1}^{N_T} e_{jt}(k) f(x_j, v^t_{αi}(k), σ^t_{αi}(k))
w^t_{αi}(k+1) = w^t_{αi}(k) - η_1 ∂E_t(k)/∂w^t_{αi}(k), α ∈ {P, N, F}, i = 1, 2, ..., I_α

(2) Center estimation at the k-th learning iteration:

∂E_t(k)/∂v^t_{αi}(k) = -Σ_{j=1}^{N_T} w^t_{αi}(k) e_{jt}(k) f(x_j, v^t_{αi}(k), σ^t_{αi}(k)) Λ (x_j - v^t_{αi}(k)) / (σ^t_{αi}(k))^2
v^t_{αi}(k+1) = v^t_{αi}(k) - η_2 ∂E_t(k)/∂v^t_{αi}(k), α ∈ {P, N, F}, i = 1, 2, ..., I_α

(3) Width estimation at the k-th learning iteration:

∂E_t(k)/∂σ^t_{αi}(k) = -Σ_{j=1}^{N_T} w^t_{αi}(k) e_{jt}(k) f(x_j, v^t_{αi}(k), σ^t_{αi}(k)) (x_j - v^t_{αi}(k))^T Λ (x_j - v^t_{αi}(k)) / (σ^t_{αi}(k))^3
σ^t_{αi}(k+1) = σ^t_{αi}(k) - η_3 ∂E_t(k)/∂σ^t_{αi}(k), α ∈ {P, N, F}, i = 1, 2, ..., I_α

(4) Repeat steps (1)-(3) until convergence or a maximum number of iterations is reached
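Steps (1)-(3) can be sketched for a single subnet on toy data, with Λ taken as the identity; the width clipping is a numerical-safety choice of this sketch, not part of the derivation above:

```python
import numpy as np

def rbf(X, V, s):
    """Gaussian kernels f(x_j, v_i, sigma_i); Lambda is the identity here."""
    d2 = ((X[:, None, :] - V[None]) ** 2).sum(-1)   # squared distances, shape (N, I)
    return np.exp(-d2 / (2 * s ** 2))

def train_rbf(X, Y, V, s, w, etas=(0.01, 0.005, 0.005), epochs=300):
    """Gradient descent on the weights, centers and widths of one subnet,
    mirroring steps (1)-(3); etas play the role of eta_1, eta_2, eta_3."""
    for _ in range(epochs):
        F = rbf(X, V, s)
        e = Y - F @ w                                # e_j = Y(x_j) - F(x_j)
        w = w + etas[0] * (F.T @ e)                  # step (1): weight update
        for i in range(len(V)):                      # step (2): center update
            V[i] += etas[1] * w[i] / s[i] ** 2 * ((e * F[:, i]) @ (X - V[i]))
        d2 = ((X[:, None, :] - V[None]) ** 2).sum(-1)
        grad_s = (e[:, None] * rbf(X, V, s) * d2 / s ** 3).sum(0)
        s = np.clip(s + etas[2] * w * grad_s, 0.05, 5.0)  # step (3), widths kept positive
    return w, V, s

rng = np.random.default_rng(0)
X = rng.random((40, 2))                              # toy 2-D "feature vectors"
Y = (X[:, 0] > 0.5).astype(float)                    # toy relevant(1)/irrelevant(0) targets
V = X[rng.choice(40, 6, replace=False)].copy()       # centers (sampled here, not clustered)
s = np.full(6, 0.3)
w = np.zeros(6)
w, V, s = train_rbf(X, Y, V, s, w)
mse = float(np.mean((Y - rbf(X, V, s) @ w) ** 2))
```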
Graphical User Interface
Image Database
100 categories, 10,000 color images
Objective System Performance (I)

Experimental setup:
- 100 queries
- Top 25 results
- Average precision-versus-recall graph of the PFRBFN and ARBFN (P. Muneesawang et al. 2002) methods after 5 iterations
Objective System Performance (II)

Experimental setup:
- 100 queries
- Retrieval accuracy of the PFRBFN and ARBFN methods in the top 25 results
Subjective System Performance

Experimental setup:
- 150 queries
- Retrieval accuracy of the PFRBFN, ARBFN and MARS methods in the top 25 results
Case Study (I)

Objective: looking for home pets, especially dogs

(a) Initial retrieval results based on k-NN search
Case Study (II)
(b) Retrieval results with user marking the cat images as irrelevant
Case Study (III)
(c) Retrieval results with user marking the cat images as relevant
Case Study (IV)
(d) Retrieval results with user marking the cat images as fuzzy
Pseudo-labeling and Machine Learning
Small Sample Problem

- Relevance feedback in CBIR systems uses only the labeled images for learning
- Image labeling is a time-consuming task, and users are often unwilling to label too many images

Challenge: learning from a small number of training samples. What is the solution?
Incorporating Unlabeled Images

Discriminant Expectation Maximization (D-EM):
- Incorporates unlabeled samples to estimate the underlying probability distribution (Y. Wu et al. 2000)

Transductive support vector machine (TSVM):
- Incorporates unlabeled images to train an initial SVM, followed by standard active learning (T. Joachims et al. 1999; L. Wang et al. 2003)

Support vector machine (SVM) with prior knowledge:
- Incorporates prior knowledge into the SVM (L. Wang et al. 2004)
Pseudo-labeling

Idea of existing pseudo-labeling methods:
- Obtaining a large number of labeled images is labor-intensive, while unlabeled images are readily available
- Utilize the unlabeled images to augment the available labeled images
- Each selected unlabeled image is assigned a pseudo-label of either 'relevant' or 'irrelevant' based on a proposed algorithm

Shortcoming:
- Pseudo-labeled images are fuzzy in nature, as they are not explicitly labeled by the users
- The potential imprecision embedded in their class information should be taken into consideration
Computational Intelligence in Pseudo-labeling

- Fuzzy support vector machine with active learning
- Soft membership estimation of unlabeled images
- Label propagation
- Two-stage clustering of labeled samples into relevant and irrelevant classes
Active Learning

- Active learning is designed to achieve maximal information gain, or to minimize uncertainty in decision making
- Active learning selects the most informative samples and queries the users for their labels

SVM-based active learning:
- Samples closest to the current SVM decision boundary are selected as the most informative points
- Samples farthest from the boundary on the positive side are considered the most relevant images
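A minimal scikit-learn sketch of this selection rule on synthetic data (the feature vectors and labels are stand-ins for image features and relevance feedback):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)        # stand-in for relevant/irrelevant

# Initial feedback set: a few clearly relevant and clearly irrelevant samples
order = np.argsort(X[:, 0] + X[:, 1])
labeled = list(order[:5]) + list(order[-5:])
pool = [i for i in range(200) if i not in labeled]

svm = SVC(kernel="linear").fit(X[labeled], y[labeled])

df = svm.decision_function(X[pool])
query = [pool[i] for i in np.argsort(np.abs(df))[:5]]   # closest to the boundary: most informative
most_relevant = [pool[i] for i in np.argsort(-df)[:5]]  # farthest on the positive side
```

`query` would be shown to the user for labeling, while `most_relevant` would be displayed as the current retrieval result.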
Proposed Pseudo-labeling Framework for CBIR

(1) Perform a k-nearest neighbor (k-NN) search and return the top most similar images to the user for feedback
(2) The user labels the images as either relevant or irrelevant, and an initial SVM classifier is trained
(3) SVM active learning selects the l_0 unlabeled images closest to the current SVM decision boundary for the user to label
(4) After the user labels the l_0 images, add them to the previously labeled training set
(5) A two-stage clustering is performed separately on the labeled relevant and irrelevant images; the formed clusters are then used for unlabeled-image selection and pseudo-label assignment
(6) A fuzzy metric evaluates the soft relevance membership of the pseudo-labeled images
(7) An FSVM is trained using a hybrid of the labeled and pseudo-labeled images
(8) Repeat steps (3)-(7) until the retrieval performance is satisfactory
Support Vector Machines

- Map the input data into a high-dimensional feature space through a mapping function
- Find the optimal separating hyperplane w · z + b = 0 with minimal classification errors in that space
Soft Errors of Fuzzy SVM

Figure source: Ronan Collobert, Dalle Molle Institute for Perceptual Artificial Intelligence (IDIAP)
Soft Membership Estimation of Pseudo-labeled Images
Objective: find a fuzzy membership mapping that assigns a relevance value in [0, 1] to each pseudo-labeled image x_P:

g(x_P) = w_1(x_P) · w_2(x_P)

The membership function depends on two factors:

- w_1: the distance of the pseudo-labeled image to the labeled images, measured through cluster centers. With v_{Si} the cluster centers of the same class as the pseudo-label and v_{Oj} those of the opposite class,

  w_1(x_P) = exp( -a · min_i (x_P - v_{Si})^T (x_P - v_{Si}) / min_j (x_P - v_{Oj})^T (x_P - v_{Oj}) ) if min_i (x_P - v_{Si})^T (x_P - v_{Si}) / min_j (x_P - v_{Oj})^T (x_P - v_{Oj}) < 1, and 0 otherwise

- w_2: the agreement between the pseudo-label and the output y(x) of the trained SVM:

  w_2(x_P) = 1 / (1 + exp(-a · y(x_P))) if the pseudo-label is positive, and 1 / (1 + exp(a · y(x_P))) otherwise
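The two factors can be computed directly; the sketch below follows the formulas above with illustrative cluster centers and an assumed SVM output (the scaling constant a and all numerical values are placeholders):

```python
import numpy as np

def w1(x, same_centers, opp_centers, a=1.0):
    """Distance factor: ratio of the squared distance to the nearest
    same-class cluster center over that to the nearest opposite-class one."""
    ds = min(float((x - v) @ (x - v)) for v in same_centers)
    do = min(float((x - v) @ (x - v)) for v in opp_centers)
    return np.exp(-a * ds / do) if ds < do else 0.0

def w2(y_svm, positive, a=1.0):
    """Agreement factor: sigmoid of the SVM output y(x), flipped when the
    pseudo-label is negative."""
    return 1.0 / (1.0 + np.exp(-a * y_svm)) if positive else 1.0 / (1.0 + np.exp(a * y_svm))

x = np.array([0.2, 0.1])                       # a pseudo-labeled (relevant) image
rel_centers = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
irr_centers = [np.array([2.0, 2.0])]
g = w1(x, rel_centers, irr_centers) * w2(y_svm=0.8, positive=True)
```

Here `g` is high because x lies near a relevant cluster center and the SVM output agrees with the positive pseudo-label.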
Fuzzy Support Vector Machine

Fuzzy membership μ_i is introduced to reflect the different contributions of the inputs (labeled and pseudo-labeled images):

minimize (1/2) ||w||^2 + C Σ_{i=1}^{n} μ_i ξ_i
subject to y_i (w · z_i + b) ≥ 1 - ξ_i, ξ_i ≥ 0, i = 1, ..., n

The optimization problem of the FSVM can be transformed into its dual problem:

maximize Σ_{i=1}^{n} α_i - (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} α_i α_j y_i y_j K(x_i, x_j)
subject to Σ_{i=1}^{n} y_i α_i = 0, 0 ≤ α_i ≤ μ_i C, i = 1, ..., n
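In practice, the per-sample bound μ_i C can be obtained from standard SVM solvers that accept sample weights: scikit-learn's SVC scales C per sample through `sample_weight`, which gives a working FSVM sketch (the data and memberships below are synthetic):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Explicitly labeled images (full confidence) ...
X_lab = rng.normal(size=(20, 2)) + np.array([[2, 0]] * 10 + [[-2, 0]] * 10)
y_lab = np.array([1] * 10 + [-1] * 10)
# ... plus pseudo-labeled images with soft relevance memberships g(x) in (0, 1)
X_pse = rng.normal(size=(10, 2)) + np.array([[2, 0]] * 5 + [[-2, 0]] * 5)
y_pse = np.array([1] * 5 + [-1] * 5)
mu_pse = rng.uniform(0.2, 0.8, size=10)

X = np.vstack([X_lab, X_pse])
y = np.concatenate([y_lab, y_pse])
mu = np.concatenate([np.ones(20), mu_pse])      # labeled images get membership 1

# sample_weight scales C per sample, i.e. the mu_i * C bound in the dual
fsvm = SVC(kernel="rbf", C=10.0).fit(X, y, sample_weight=mu)
acc = fsvm.score(X, y)
```

Low-membership pseudo-labeled images therefore contribute less to the decision boundary than the explicitly labeled ones.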
System Performance (I)

Experimental setup:
- 100 queries
- Five feedback iterations
- Average precision-versus-recall graphs of the PLFSVM and SVM (S. Tong et al. 2001) methods after the first iteration of active learning, for l_0 = 10
System Performance (II)

Retrieval accuracy of the PLFSVM and SVM methods in the top 10 results, for l_0 = 10
Region-based Image Retrieval (RBIR) Using Neural Network
Challenges in RBIR Systems

Issues in RBIR systems:
- Choice of an image similarity metric that reduces the impact of imprecise segmentation
- Determining the importance of different image regions
- A learning algorithm that progressively improves retrieval accuracy through user interaction
Variable-Length Radial Basis Function Network (VLRBFN)

We introduce the VLRBFN to model and learn user perception of image similarity in RBIR systems:
- The VLRBFN is developed for variable-length region-based image representation in RBIR systems
- A systematic region-weight learning strategy is introduced
- A kernel function centered on the variable-length representation (VLR) is introduced
General RBIR Overview

[System diagram. Offline processing: database creation, image segmentation (mean-shift algorithm), and feature extraction (color, texture). Online querying: query selection, image segmentation, feature extraction, similarity comparison (EMD distance), display and relevance feedback, and machine learning (VLRBFN) inside a relevance feedback loop]
Region-based Image Representation

Mean-shift algorithm:
- Mode seeking in non-parametric distributions
- Partitions images into homogeneous regions

Extracted features for the regions:
- Color feature: color moments
- Texture feature: wavelet moments

Perceptual determination of region weights:
- Reflects the importance of the regions
- Criteria: area, location
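Mean-shift segmentation groups pixels by the modes of their joint color-position distribution; a toy sketch with scikit-learn's MeanShift on a synthetic two-tone image (the 0.05 spatial scaling and the bandwidth are illustrative choices):

```python
import numpy as np
from sklearn.cluster import MeanShift

# Tiny synthetic "image": left half dark, right half bright
img = np.zeros((8, 8, 3))
img[:, 4:] = 1.0

h, w, _ = img.shape
yy, xx = np.mgrid[0:h, 0:w]
# Joint color + (scaled) position features, one row per pixel
feats = np.column_stack([img.reshape(-1, 3), 0.05 * xx.ravel(), 0.05 * yy.ravel()])

labels = MeanShift(bandwidth=0.8).fit_predict(feats)  # mode seeking on pixel features
segments = labels.reshape(h, w)
n_regions = len(np.unique(labels))
```

Each resulting label map region would then be described by its color and wavelet moments, as above.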
Region Segmentation
Region Weight Learning

Perceptual determination:
- Determine the most salient region in each relevant image, such that these regions jointly capture the semantic class of the user's query
- Use perceptual importance (area and location) together with the feedback information

Density estimation:
- Utilize the set of selected regions from all the relevant images to estimate the weights of other, unseen regions
- Employ a one-class SVM (OCSVM) to estimate their density distribution
- The importance of a test region is evaluated by determining how much it differs from the estimated distribution
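The density-estimation step can be sketched with scikit-learn's OneClassSVM; squashing the decision score through a sigmoid to obtain a (0, 1) weight is an illustrative choice of this sketch, not necessarily the published weighting:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
salient = rng.normal(0.0, 0.3, size=(40, 4))       # features of regions judged salient

# Estimate the density of salient-region features with a one-class SVM
ocsvm = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(salient)

test_regions = np.array([[0.1, 0.0, -0.1, 0.2],    # close to the salient set
                         [3.0, 3.0, 3.0, 3.0]])    # far from it
scores = ocsvm.decision_function(test_regions)     # higher = better fit to the density
weights = 1.0 / (1.0 + np.exp(-scores))            # squash into (0, 1) region weights
```

A region resembling the salient set receives a larger weight than one that deviates from the estimated distribution.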
Image Similarity Metric

Earth Mover's Distance (EMD) (Y. Rubner et al. 2000):
- Measures the least amount of work needed to transform one image distribution into the other
- Operates on variable-length representations
- Suitable for region-based image similarity comparison

Variable-length representation (VLR):
- Each image is described by a set of weighted features
- The size of the set may vary with the number of regions in different images
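EMD is the optimal value of a small transportation problem; for equal-mass signatures it can be solved with a generic linear-programming routine, as in this SciPy sketch:

```python
import numpy as np
from scipy.optimize import linprog

def emd(w_p, w_q, D):
    """EMD between two equal-mass signatures: minimize sum_ij f_ij * d_ij
    subject to the row/column flow constraints, then normalize by total flow."""
    m, n = D.shape
    A_eq, b_eq = [], []
    for i in range(m):                      # flow out of each source bin = w_p[i]
        row = np.zeros(m * n)
        row[i * n:(i + 1) * n] = 1.0
        A_eq.append(row); b_eq.append(w_p[i])
    for j in range(n):                      # flow into each target bin = w_q[j]
        col = np.zeros(m * n)
        col[j::n] = 1.0
        A_eq.append(col); b_eq.append(w_q[j])
    res = linprog(D.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun / min(w_p.sum(), w_q.sum())

# Two toy "images", each a set of weighted region features
P = np.array([[0.0, 0.0], [1.0, 0.0]]); w_p = np.array([0.5, 0.5])
Q = np.array([[0.0, 0.0], [1.0, 1.0]]); w_q = np.array([0.5, 0.5])
D = np.linalg.norm(P[:, None, :] - Q[None], axis=2)  # pairwise region distances
d = emd(w_p, w_q, D)
```

Because the signatures can have different numbers of regions, this comparison works directly on the variable-length representation.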
Structure of Progressive VLRBFN

[Network diagram: an image X = {(C_k, W_k)}_{k=1}^{m} of weighted region features feeds a hidden layer with weights w^t_1, w^t_2, ..., w^t_c; the output U_t(X) is accumulated progressively at the output layer:]

F_t(X) = F_{t-1}(X) + U_t(X), t = 1, 2, ..., T; F_0(X) = 0
VLRBFN Training (I)

Kernel function:

f(X_j, V^t_i, σ^t_i) = exp( -EMD^2(X_j, V^t_i) / (2 (σ^t_i)^2) ), t = 1, 2, ..., T; i = 1, 2, ..., c

VLRBFN output:

F_t(X_j) = F_{t-1}(X_j) + Σ_{i=1}^{c} w^t_i f(X_j, V^t_i, σ^t_i), t = 1, 2, ..., T; F_0(X_j) = 0

Error function:

E_t = (1/2) Σ_{j=1}^{N_T} e_{jt}^2 = (1/2) Σ_{j=1}^{N_T} ( Y_t(X_j) - F_t(X_j) )^2

where

EMD(X_p, X_q) = Σ_{i=1}^{m} Σ_{j=1}^{n} f_{ij} d(C_{pi}, C_{qj}) / Σ_{i=1}^{m} Σ_{j=1}^{n} f_{ij}
VLRBFN Training (II)

Gradient descent learning:

θ̂ = arg min_{θ ∈ Θ} E_t = arg min_{θ ∈ Θ} (1/2) Σ_{j=1}^{N_T} ( Y_t(X_j) - F_t(X_j) )^2

(1) Weight estimation at the l-th learning iteration:

∂E_t(l)/∂w^t_i(l) = -Σ_{j=1}^{N_T} e_{jt}(l) f(X_j, V^t_i(l), σ^t_i(l))
w^t_i(l+1) = w^t_i(l) - η_1 ∂E_t(l)/∂w^t_i(l), i = 1, 2, ..., c

(2) Width estimation at the l-th learning iteration:

∂E_t(l)/∂σ^t_i(l) = -Σ_{j=1}^{N_T} w^t_i(l) e_{jt}(l) f(X_j, V^t_i(l), σ^t_i(l)) EMD^2(X_j, V^t_i(l)) / (σ^t_i(l))^3
σ^t_i(l+1) = σ^t_i(l) - η_2 ∂E_t(l)/∂σ^t_i(l), i = 1, 2, ..., c

(3) Repeat steps (1)-(2) until convergence or a maximum number of iterations is reached
System Performance (I)

Experimental setup:
- 100 queries
- Average precision-versus-recall graph in the top 25 returned images after 1 and 5 feedback iterations

Comparison:
- GRBFN method (K. H. Yap et al. 2005)
- RQPM and RSVM methods (F. Jing et al. 2004)
System Performance (II)

Observations of the developed method:
- Better retrieval performance than both the GRBFN and RQPM methods
- Retrieval performance comparable to the RSVM method, but computationally more efficient
System Performance (III)

Observations:
- Consistently provides superior results compared with the GRBFN and RQPM methods
- Achieves retrieval performance comparable to the RSVM method
- Seven times faster than the RSVM method
Peer Tagging and Knowledge Propagation
Hybrid Keyword- and Content-based Image Retrieval

- Retrieving images via low-level features alone cannot achieve satisfactory results
- Keywords are the best descriptors for image semantics
- Build interactive CBIR systems that support high-level semantic queries
- Integrate the strengths of content- and keyword-based image indexing and retrieval algorithms, alleviating their respective difficulties
Peer Tagging in Image Retrieval
Peer Tagging

Motivation:
- The challenge of the semantic gap
- Full manual annotation of a complete database is tedious and expensive

Peer tagging:
- Distributed users (particularly Internet users) provide tags (keywords) to shared images

Advantages of peer tagging:
- Distributes the annotation workload across multiple voluntary contributors
- Low cost, simple, easy, flexible and effort-sharing

Online media (image and video) tagging systems: Flickr, YouTube
Issues in Peer Tagging

Human-computer interface (HCI) for image tagging:
- Tagging granularity (tagging on segmented regions or on global images)
- Providing a user-friendly environment for image tagging

Tag generation and formation:
- Tag reliability (spelling mistakes; inaccurate, irrelevant, or esoteric tags)
- Suggestive tagging, e.g. the tag cloud in Flickr.com

Tag clustering:
- Identify groups of tags sharing similar semantic concepts, e.g. tag clustering in Flickr.com
Knowledge Propagation

Issues involved in peer tagging services:
- The correlation between the tags and the media contents needs to be explored
- Only a fraction of the images in the complete collection is annotated

Knowledge propagation:
- Propagate keywords from a small set of keyword-annotated images to the whole population
Knowledge Propagation

[Mobile scenario: a visitor photographs an animal and sends an MMS to the zoo's network server; the server matches the animal (gorilla, monkey, bear, lion) and retrieves more information; the visitor can then share the photograph with the zoo's database]

Quick facts returned for a gorilla match: gorillas are the biggest primates; gorillas, chimpanzees and humans are classified under the same family, Hominidae; gorillas are mainly folivorous, though they may supplement their diet with insects and small animals; gorillas live to approximately 35-40 years in the wild and about 50 years in captivity

Q: What is the key? A: Domain information
Knowledge Propagation

Basic idea: analyze the media contents and their correspondence with the annotated keywords. Media content descriptors are extracted and represented as feature vectors. An SVM-based classifier is then trained to map the feature vectors to soft labels reflecting the likelihood that a keyword matches an image or video. The SVM-based classifiers are trained on the media data annotated by peer users. Once fully trained, they are used to propagate keywords, with associated probabilities, to the remaining unannotated media in the collection.
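The propagation idea, one probabilistic classifier per keyword applied to unannotated items, can be sketched as follows; logistic regression stands in for the SVM-based classifier described above, and the features and tags are synthetic:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Peer-annotated images: feature vectors plus binary tags per keyword
feats = rng.normal(size=(60, 5))
tags = {"gorilla": (feats[:, 0] > 0).astype(int),
        "tree": (feats[:, 1] > 0).astype(int)}

# One probabilistic classifier per keyword (logistic regression stands in
# for the SVM-based classifier of the slide)
models = {kw: LogisticRegression().fit(feats, y) for kw, y in tags.items()}

# Propagation: soft labels for unannotated images
new_imgs = rng.normal(size=(3, 5))
soft_labels = {kw: m.predict_proba(new_imgs)[:, 1] for kw, m in models.items()}
```

Each unannotated image thus receives, per keyword, a probability that the keyword applies, which can be thresholded or kept as a soft annotation.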
Conclusion
Conclusion

This tutorial has shown that computational intelligence is instrumental in addressing some challenging issues in image indexing and retrieval:

- Issue: fuzzy user perception. Technique: a soft relevance framework that integrates the users' fuzzy perception of visual contents
- Issue: small sample problem. Technique: pseudo-labeling, active learning and FSVM
- Issue: image region perceptual importance. Technique: perceptual region estimation and the VLRBFN
- Issue: peer tagging and knowledge propagation. Technique: domain-based keyword propagation
Selected Recent Publications

- Kim-Hui Yap and Kui Wu, "A soft relevance framework in content-based image retrieval systems," IEEE Trans. Circuits and Systems for Video Technology, vol. 15, no. 12, pp. 1557–1568, Dec. 2005.
- Kui Wu and Kim-Hui Yap, "Fuzzy SVM for content-based image retrieval: a pseudo-label support vector machine framework," IEEE Computational Intelligence Magazine, vol. 1, pp. 10–16, May 2006.
- K. Wu and K.-H. Yap, "Content-based image retrieval using fuzzy perceptual feedback," accepted for publication in Multimedia Tools and Applications, Kluwer.
- K. Wu and K.-H. Yap, "A perceptual subjectivity notion in interactive content-based image retrieval systems," in Intelligent Multimedia Processing with Soft Computing, Springer-Verlag, pp. 55–73, 2005.
- Kui Wu and Kim-Hui Yap, "Region-based image retrieval using radial basis function network," in Proc. IEEE Int. Conf. Multimedia & Expo, Toronto, Canada, 2006.
References (I)

- M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker, "Query by image and video content: The QBIC system," IEEE Computer, vol. 28, no. 9, pp. 23–32, Sept. 1995.
- Y. Rui, T. S. Huang, and S. Mehrotra, "Content-based image retrieval with relevance feedback in MARS," in Proc. IEEE Int. Conf. Image Processing, Washington D.C., USA, pp. 815–818, 1997.
- G. Amarnath and J. Ramesh, "Visual information retrieval," Communications of the ACM, vol. 40, no. 5, pp. 70–79, May 1997.
- A. Pentland, R. W. Picard, and S. Sclaroff, "Photobook: Content-based manipulation of image databases," International Journal of Computer Vision, vol. 18, no. 3, pp. 233–254, June 1996.
- J. R. Smith and S. F. Chang, "VisualSEEk: A fully automated content-based image query system," in Proc. ACM Multimedia, pp. 87–98, Nov. 1996.
- T. Gevers and A. W. M. Smeulders, "PicToSeek: Combining color and shape invariant features for image retrieval," IEEE Trans. Image Processing, vol. 9, pp. 102–119, 2000.
- I. J. Cox, M. L. Miller, T. P. Minka, T. V. Papathomas, and P. N. Yianilos, "The Bayesian image retrieval system, PicHunter: Theory, implementation, and psychophysical experiments," IEEE Trans. Image Processing, vol. 9, no. 1, pp. 20–37, 2000.
- A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, Dec. 2000.
References (II)

- Y. Rui, T. S. Huang, M. Ortega, and S. Mehrotra, "Relevance feedback: A power tool for interactive content-based image retrieval," IEEE Trans. Circuits and Systems for Video Technology, vol. 8, no. 5, pp. 644–655, 1998.
- W. Y. Ma and B. S. Manjunath, "NeTra: A toolbox for navigating large image databases," Multimedia Systems, vol. 7, no. 3, pp. 184–198, 1999.
- J. Z. Wang, J. Li, and G. Wiederhold, "SIMPLIcity: Semantics-sensitive integrated matching for picture libraries," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947–963, 2001.
- C. Carson, S. Belongie, H. Greenspan, and J. Malik, "Blobworld: Image segmentation using expectation-maximization and its application to image querying," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 8, pp. 1026–1038, 2002.
- Y. X. Chen and J. Z. Wang, "A region-based fuzzy feature matching approach to content-based image retrieval," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 9, pp. 1252–1267, 2002.
- F. Jing, M. J. Li, H. J. Zhang, and B. Zhang, "Relevance feedback in region-based image retrieval," IEEE Trans. Circuits and Systems for Video Technology, vol. 14, no. 5, pp. 672–681, 2004.
- R. Zhang and Z. Zhang, "Hidden semantic concept discovery in region based image retrieval," in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, pp. 996–1001, June 2004.
- Z. Stejic, Y. Takama, and K. Hirota, "Relevance feedback-based image retrieval interface incorporating region and feature saliency patterns as visualizable image similarity criteria," IEEE Trans. Industrial Electronics, vol. 50, no. 5, pp. 839–852, 2003.
References (III)

- P. Muneesawang and L. Guan, "Automatic machine interactions for content-based image retrieval using a self-organizing tree map architecture," IEEE Trans. Neural Networks, vol. 13, no. 4, pp. 821–834, July 2002.
- Y. Wu, Q. Tian, and T. S. Huang, "Discriminant-EM algorithm with application to image retrieval," in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, South Carolina, pp. 222–227, 2000.
- L. Wang and K. L. Chan, "Bootstrapping SVM active learning by incorporating unlabelled images for image retrieval," in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, pp. 629–634, 2003.
- S. Tong and E. Chang, "Support vector machine active learning for image retrieval," in Proc. ACM Int. Conf. Multimedia, Ottawa, Canada, pp. 107–118, 2001.
- Y. Rubner, C. Tomasi, and L. Guibas, "The Earth Mover's Distance as a metric for image retrieval," International Journal of Computer Vision, vol. 40, pp. 99–123, 2000.
- J. Jeon, V. Lavrenko, and R. Manmatha, "Automatic image annotation and retrieval using cross-media relevance models," in Proc. Int. ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 119–126, 2003.
- E. Chang, G. Kingshy, G. Sychay, and G. Wu, "CBSA: Content-based soft annotation for multimodal image retrieval using Bayes point machines," IEEE Trans. Circuits and Systems for Video Technology, vol. 13, no. 1, pp. 26–38, Jan. 2003.
- V. N. Vapnik, The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.
- S. Haykin, Neural Networks: A Comprehensive Foundation. Upper Saddle River, NJ: Prentice-Hall, 1999.
THANK YOU! ☺
Questions?