Optimizing Learning with SVM Constraint for Content-based Image Retrieval*
Steven C.H. Hoi
1st March, 2004
*Note: The copyright of the presentation material is held by the authors.
Outline
• Introduction
• Related Work
• Optimizing Learning with SVM Constraint
• Experimental Results
• Discussions and Future Work
• Conclusions
Introduction
In CBIR, there exists a gap between the high-level semantics and the low-level features computed by computers.
To learn the associations between human perception and the low-level features, relevance feedback was proposed as a natural way to solve this task.
In a CBIR system, users are asked to provide relevance judgements on the query results. Based on the user's feedback, the CBIR system refines the retrieval performance round by round.
Difficulties in relevance feedback: high-dimensional feature space, small training samples.
Related Work
Major techniques for relevance feedback:
Query-point movement: Rocchio's formula
• Moving the ideal query point toward the positive examples and away from the negative examples.
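Rocchio's query-point movement can be sketched as follows; the weights `alpha`, `beta`, and `gamma` are illustrative defaults, not values used in the talk:

```python
def rocchio_update(query, positives, negatives,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query point toward the centroid of the positive
    examples and away from the centroid of the negative examples."""
    def centroid(vectors):
        n = len(vectors)
        return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

    pos_c = centroid(positives)
    neg_c = centroid(negatives)
    return [alpha * q + beta * p - gamma * g
            for q, p, g in zip(query, pos_c, neg_c)]

# The updated query drifts toward the positive examples.
new_q = rocchio_update([0.0, 0.0], [[1.0, 1.0], [3.0, 1.0]], [[-2.0, 0.0]])
```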
Re-weighting: MARS
• Axis re-weighting: the inverse of the standard deviation of a feature, say the j-th feature, is used as the weight for the corresponding axis.
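The MARS-style axis re-weighting described above can be sketched in a few lines; this simplified version assumes every axis has nonzero variance over the positive examples:

```python
import statistics

def mars_weights(positives):
    """MARS-style axis re-weighting: weight each axis by the inverse of
    the (population) standard deviation of the positives along that axis.
    Simplification: assumes nonzero variance on every axis."""
    dims = range(len(positives[0]))
    return [1.0 / statistics.pstdev([v[d] for v in positives]) for d in dims]

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance under the learned axis weights."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5

# An axis where the positives vary more (std = 5) gets a smaller weight.
w = mars_weights([[0.0, 0.0], [2.0, 10.0]])
```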
Parameter optimization methods: MindReader, Rui'00
• Minimize the total distance of the positive examples to the ideal query point.
• Based on the Generalized Ellipsoid Distance.
Kernel-based classification techniques: SVMs, Boosting, etc.
SVM
Advantages
• Sound theoretical background
• Minimizes structural risk rather than empirical risk
• Excellent classification performance
Basic Theory
Learning the boundary with SVM
Considering the soft margin, the SVM is formulated as a constrained optimization problem.
The optimization problem can be solved by introducing Lagrange multipliers.
The derived decision function follows from the dual solution.
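The slide equations did not survive transcription; the standard soft-margin SVM primal and the dual decision function, which the text above appears to reference, are:

```latex
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}} \ \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_{n=1}^{N}\xi_n
\quad \text{s.t.} \quad y_n\big(\mathbf{w}^\top\phi(\mathbf{x}_n)+b\big) \ge 1-\xi_n,\ \ \xi_n \ge 0,

f(\mathbf{x}) = \sum_{n=1}^{N} \alpha_n\, y_n\, K(\mathbf{x}_n,\mathbf{x}) + b .
```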
The distance function used by the SVM to measure similarity for image retrieval is typically taken as the (signed) distance from the decision boundary.
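A minimal sketch of this distance-from-boundary measure with an RBF kernel, assuming the dual coefficients, labels, and bias are already trained (the values below are illustrative, not learned):

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq)

def svm_decision(x, support_vectors, alphas, labels, bias, gamma=0.5):
    """Decision value f(x) = sum_n alpha_n y_n K(x_n, x) + b; larger
    values are treated as more relevant when ranking."""
    return sum(a * y * rbf_kernel(sv, x, gamma)
               for sv, a, y in zip(support_vectors, alphas, labels)) + bias

# Illustrative two-support-vector model: one positive, one negative.
svs = [[0.0, 0.0], [2.0, 2.0]]
score = svm_decision([0.0, 0.0], svs, alphas=[1.0, 1.0], labels=[1, -1], bias=0.0)
```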
Limitations of SVMs
Inaccurate boundary with few samples
• In the initial rounds of relevance feedback, the number of training samples is small, so SVMs cannot accurately capture the data distribution.
Ranking problem
• SVM was originally designed for classification. Simply taking the distance from the SVM boundary as the distance function may not be effective enough to describe the data.
Solutions
A heuristic approach
Guo et al. [4] suggested a simple approach that embeds the Euclidean distance in SVM learning: samples inside the SVM boundary are measured by Euclidean distance, while samples outside the boundary are evaluated by their distance from the SVM boundary.
However, this heuristic, based on the plain Euclidean distance, is not flexible and powerful enough to describe the data distribution. Moreover, it lacks a systematic mathematical formulation.
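The heuristic can be sketched roughly as follows; `svm_score` stands in for a trained SVM decision function, and the large constant offset (which keeps every inside sample ranked ahead of every outside one) is an illustrative implementation choice, not taken from the paper:

```python
def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def hybrid_distance(x, query, svm_score):
    """Heuristic combination (after Guo et al. [4]): inside the positive
    region of the SVM (svm_score(x) >= 0) rank by Euclidean distance to
    the query; outside, rank by distance from the boundary, offset so
    that all inside samples rank ahead of all outside samples."""
    score = svm_score(x)
    if score >= 0:                  # inside the SVM boundary
        return euclidean(x, query)
    return 1e6 - score              # outside: farther below the boundary = worse

# Inside sample is scored by Euclidean distance; outside by margin distance.
inside = hybrid_distance([1.0, 0.0], [0.0, 0.0], lambda v: 1.0)
outside = hybrid_distance([5.0, 0.0], [0.0, 0.0], lambda v: -2.0)
```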
Optimizing Learning approach
To best learn the similarity, we suggest a novel systematic scheme: optimal learning with an SVM constraint. Our approach can optimally learn the similarity for image retrieval with the Generalized Ellipsoid Distance.
Optimizing Learning with SVM Constraint
Basic Idea
The training samples are first learned by an SVM to form a boundary separating the positive samples from the negative ones.
We then learn the optimal similarity metric from the positive samples with the Generalized Ellipsoid Distance, constrained by the SVM boundary.
Comparison of the proposed method with previous approaches.
Problem formulation and notations
N – the number of relevant samples; M – the number of features.
Given an image x_n, we use x_{ni} = [x_{ni1}, ..., x_{niL_i}] to represent its i-th feature vector, where L_i is the length of the i-th feature vector.
Let q_i = [q_{i1}, ..., q_{iL_i}] denote the ideal query vector for the i-th feature.
The Generalized Ellipsoid Distance is described by a real symmetric full matrix W_i.
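The Generalized Ellipsoid Distance d(x, q) = (x - q)^T W (x - q) can be sketched in pure Python; the matrices used below are illustrative, not learned from data:

```python
def ellipsoid_distance(x, q, W):
    """Generalized Ellipsoid Distance (x - q)^T W (x - q) for a real
    symmetric matrix W.  With W = I this reduces to the squared
    Euclidean distance."""
    d = [xi - qi for xi, qi in zip(x, q)]
    return sum(d[i] * W[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))

# With the identity matrix the distance is plain squared Euclidean.
identity = [[1.0, 0.0], [0.0, 1.0]]
dist = ellipsoid_distance([3.0, 4.0], [0.0, 0.0], identity)
```

A full (non-diagonal) W lets the metric stretch and rotate the isosurfaces, which is exactly the extra flexibility over per-axis re-weighting that the talk relies on.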
Optimization target
where u is the vector of feature weights and v assigns a goodness value to each positive sample, weighting the importance of the samples.
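The objective itself was an image in the original slides. In Rui and Huang's closely related formulation [3], on which this talk builds, the target is the weighted sum of generalized ellipsoid distances; a hedged reconstruction consistent with the notation above is:

```latex
\min_{\{\mathbf{q}_i\},\,\{W_i\},\,\mathbf{u}} \ J
  = \sum_{n=1}^{N} v_n \sum_{i=1}^{M} u_i\,
    (\mathbf{x}_{ni}-\mathbf{q}_i)^\top W_i\,(\mathbf{x}_{ni}-\mathbf{q}_i)
\quad \text{s.t.}\quad \det(W_i)=1,\qquad \sum_{i=1}^{M} \frac{1}{u_i}=1 .
```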
To fuse the optimal learning model with SVM learning, we set the goodness value v(x) according to the SVM distance in each iteration.
By introducing Lagrange multipliers, we can solve the preceding optimization problem. We present only the major conclusions below.
The optimal solution for the ideal query point can be derived in closed form.
The optimal solution for the distance matrix can be derived in closed form, expressed through the weighted covariance matrix of X_i.
The optimal solution for the feature-weight vector u can be derived in closed form.
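The closed forms were images in the slides. For reference, in Rui and Huang's formulation [3] they take the following shape, with C_i the weighted covariance matrix of X_i and f_i the weighted total distance under the i-th feature (the SVM-constrained versions in this talk modify these through the choice of v):

```latex
\mathbf{q}_i^{*} = \frac{\sum_{n=1}^{N} v_n\,\mathbf{x}_{ni}}{\sum_{n=1}^{N} v_n},
\qquad
W_i^{*} = \big(\det C_i\big)^{1/L_i}\, C_i^{-1},
\qquad
u_i^{*} = \frac{\sum_{j=1}^{M}\sqrt{f_j}}{\sqrt{f_i}},

\text{where}\quad
f_i = \sum_{n=1}^{N} v_n\,(\mathbf{x}_{ni}-\mathbf{q}_i)^\top W_i\,(\mathbf{x}_{ni}-\mathbf{q}_i).
```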
Distance Measure Metrics
The final distance measure is normalized by a constant MaxDis.
Experimental Results
Datasets
• Natural images selected from COREL CDs
• Two datasets: 20-Category and 50-Category
• Each category contains 100 images.
Feature Representation
• 9-dimensional Color Moments
• 18-dimensional Edge Direction Histogram
• 9-dimensional Wavelet-based Texture
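The 9-dimensional color-moment feature is conventionally the mean, standard deviation, and skewness of each of three color channels; the slides do not spell out the exact definition, so the sketch below assumes that conventional layout:

```python
def color_moments(channels):
    """Return a 9-D color-moment feature: (mean, std, skewness) for each
    of three color channels, each given as a flat list of pixel values.
    Skewness here is the signed cube root of the third central moment,
    a common convention (assumed, not taken from the talk)."""
    feature = []
    for pixels in channels:
        n = len(pixels)
        mean = sum(pixels) / n
        var = sum((p - mean) ** 2 for p in pixels) / n
        third = sum((p - mean) ** 3 for p in pixels) / n
        skew = abs(third) ** (1.0 / 3.0) * (1 if third >= 0 else -1)
        feature.extend([mean, var ** 0.5, skew])
    return feature

# Three toy channels (R, G, B) of three pixels each -> a 9-D vector.
feat = color_moments([[0.1, 0.5, 0.9], [0.2, 0.2, 0.2], [0.0, 1.0, 0.5]])
```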
Experimental Settings
• Radial Basis Function kernel for the SVMs
• To enable objective evaluation, we fix the parameters of the learning task for all compared algorithms.
Evaluation on the 20-Cat dataset: Average Precision on Top-20
Evaluation on the 50-Cat dataset:
Computational Complexity and Empirical Time Cost
Discussion and Future Work
Feature Set Selection
• In our SVM learning scheme, we do not consider feature set selection in the boundary learning.
• We could further improve the retrieval performance by combining feature set selection with the SVM learning tasks.
• The efficiency problem may also be addressed in future work.
Conclusions
In this talk, we presented a novel scheme to learn the relevance feedback task for image retrieval.
We suggested an approach based on optimal learning with an SVM constraint.
Our scheme can not only utilize the advantages of SVMs to learn the boundary in a high-dimensional feature space, but also exploit the hidden similarity structure within the SVM boundary.
Compared with previous methods, our systematically formulated approach performs better in the preliminary experimental results.
References
[1] Chu-Hong Hoi and Michael R. Lyu. Optimizing Learning with SVM Constraint for Content-Based Image Retrieval. Technical Report, Department of Computer Science and Engineering, The Chinese University of Hong Kong, March 2004.
[2] Yoshiharu Ishikawa, Ravishankar Subramanya, and Christos Faloutsos. MindReader: Querying Databases Through Multiple Examples. Proc. 24th Int. Conf. on Very Large Data Bases (VLDB), 1998.
[3] Y. Rui and T.S. Huang. Optimizing Learning in Image Retrieval. IEEE Conf. on CVPR, June 2000.
[4] G. D. Guo, A. K. Jain, W. Y. Ma, and H. J. Zhang. Learning Similarity Measure for Natural Image Retrieval with Relevance Feedback. IEEE Trans. on Neural Networks, vol. 13, no. 4, pp. 811-820, July 2002.