
Tag Ranking

Presented by Jie Xiao

Dept. of Computer Science

Univ. of Texas at San Antonio

[email protected] 2

Outline

Problem

Probabilistic tag relevance estimation

Random walk tag relevance refinement

Experiment

Conclusion

Problem

Millions of social images are available on the Internet, which makes them an attractive resource for research.

The tags associated with an image are not ordered by relevance.

Problem (Cont.)

Tag relevance

There are two types of relevance to be considered.

The relevance between a tag and an image

The relevance between two tags for the same image.

Probabilistic Tag Relevance Estimation

Similarity between a tag and an image

x : an image

t : a tag associated with image x

P(t|x) : the probability of tag t given image x

P(t) : the prior probability of tag t occurring in the dataset

After applying Bayes’ rule, we can derive that

Probabilistic Relevance Estimation (Cont.)

Since the goal is to rank the tags of an individual image, and p(x) is identical for all of these tags, we refine it as
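The equations from these two slides are not preserved in this transcript. A reconstruction consistent with the definitions above might read as follows. The score symbol s(t, x) and the starting definition as the ratio of p(t|x) to the prior p(t) are assumptions, and the dotted equals sign means "equal for the purpose of ranking the tags of one image":

```latex
s(t, x) = \frac{p(t \mid x)}{p(t)}
        = \frac{p(x \mid t)\, p(t)}{p(x)\, p(t)}
        = \frac{p(x \mid t)}{p(x)}
\quad\Longrightarrow\quad
s(t, x) \doteq p(x \mid t)
```

The middle step is Bayes' rule, p(t|x) = p(x|t) p(t) / p(x). Dropping p(x) does not change the order of the tags because it is the same constant for every tag of image x.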

Density Estimation

Let (x1, x2, …, xn) be an iid sample drawn from some distribution with an unknown density ƒ.

Two types of methods to describe the density:

Histogram

Kernel density estimator
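To make the contrast between the two estimators concrete, here is a minimal Python sketch; it is not code from the talk. The Gaussian sample, the bin count, the bandwidth, and the helper names `histogram_density` and `gaussian_kde` are all assumptions chosen for illustration.

```python
import math
import random

def histogram_density(sample, num_bins):
    """Return (edges, heights) of a histogram scaled to integrate to 1."""
    lo, hi = min(sample), max(sample)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for value in sample:
        # Clamp so that the maximum value falls into the last bin.
        index = min(int((value - lo) / width), num_bins - 1)
        counts[index] += 1
    heights = [c / (len(sample) * width) for c in counts]
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, heights

def gaussian_kde(sample, x, bandwidth):
    """Kernel density estimate at x, using a Gaussian kernel (an assumption)."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample) / norm

random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(500)]  # hypothetical iid sample

edges, heights = histogram_density(sample, num_bins=20)
hist_area = sum(h * (edges[i + 1] - edges[i]) for i, h in enumerate(heights))

# Riemann sum of the KDE over [-6, 6] to check that it behaves like a density.
step = 0.01
kde_area = sum(gaussian_kde(sample, -6.0 + i * step, 0.4) * step for i in range(1201))
```

Both estimates integrate to approximately 1. The histogram is piecewise constant and depends on where the bin edges fall, while the kernel estimate is smooth and depends on the bandwidth.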

Histogram

Credit: All of Nonparametric Statistics via UTSA library

Kernel Density Estimation

A smooth function K, the kernel, is used to estimate the density.

Kernel Density Estimation (Cont.)

Its kernel density estimator is
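The formula itself is missing from this transcript. The standard form of the estimator for the sample (x1, x2, …, xn), with a bandwidth h > 0 (a symbol assumed here), is:

```latex
\hat{f}_h(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)
```

A larger h gives a smoother estimate, and a smaller h follows the individual sample points more closely.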


Probabilistic Relevance Estimation (Cont.)

Kernel Density Estimation (KDE) is adopted to estimate the probability density function p(x|t).


Xi : the image set containing tag ti

xk : one of the top k nearest-neighbor images in image set Xi

K : the density kernel function used to estimate the probability

|Xi| : the cardinality of Xi
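The estimate of p(x|t) can be sketched in Python. This is an illustration under stated assumptions, not the presenter's implementation: a Gaussian kernel with bandwidth `sigma` is assumed, the feature vectors are made-up 2-D points, and `tag_relevance` is a hypothetical helper name. Restricting the sum to the k nearest neighbors and dividing by |Xi| follows the definitions listed above.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Gaussian kernel on the squared Euclidean distance (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / sigma ** 2)

def tag_relevance(x, tagged_features, sigma=1.0, k=None):
    """Estimate p(x | t) from feature vectors of the images that carry tag t.

    If k is given, only the k nearest neighbors of x contribute to the sum.
    The sum is divided by the size of the whole tagged set, |Xi|.
    """
    # The kernel decreases with distance, so the largest weights
    # belong to the nearest neighbors.
    weights = sorted(
        (gaussian_kernel(x, xk, sigma) for xk in tagged_features), reverse=True
    )
    if k is not None:
        weights = weights[:k]
    return sum(weights) / len(tagged_features)

# Hypothetical data: one query image and the images carrying each of its tags.
image = [0.0, 0.0]
features_by_tag = {
    "cat": [[0.1, 0.0], [0.0, 0.2], [0.2, 0.1]],
    "car": [[3.0, 3.0], [2.5, 3.5], [0.1, 0.1]],
}
scores = {tag: tag_relevance(image, feats) for tag, feats in features_by_tag.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy data the images tagged "cat" lie close to the query image in feature space, so "cat" receives the higher score and is ranked first.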

Relevance between tags

ti : tag i associated with image x

tj : tag j associated with image x

The image set containing tag ti

The image set containing tag tj

N : the number of nearest neighbors of image x that are considered
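The formula for the visual similarity between two tags is not preserved in this transcript. The sketch below shows one plausible form and should be read as an assumption: take the N nearest neighbors of image x within each tag's image set, average the squared distances between the two neighbor groups, and map the result to (0, 1] with an exponential. The names `exemplar_similarity` and `sigma`, and all data, are hypothetical.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest_neighbors(x, features, n):
    """The n feature vectors closest to x."""
    return sorted(features, key=lambda f: sq_dist(x, f))[:n]

def exemplar_similarity(x, feats_i, feats_j, n=2, sigma=1.0):
    """Assumed form: exp(-mean squared distance between the neighbor groups)."""
    near_i = nearest_neighbors(x, feats_i, n)
    near_j = nearest_neighbors(x, feats_j, n)
    total = sum(sq_dist(a, b) for a in near_i for b in near_j)
    mean_sq = total / (len(near_i) * len(near_j))
    return math.exp(-mean_sq / sigma ** 2)

# Hypothetical feature vectors of the images carrying each tag.
x = [0.0, 0.0]
sky = [[0.1, 0.1], [0.2, 0.0], [5.0, 5.0]]
cloud = [[0.0, 0.1], [0.1, 0.2], [4.0, 4.0]]
car = [[3.0, 3.0], [3.2, 2.9], [0.1, 0.0]]

sim_sky_cloud = exemplar_similarity(x, sky, cloud)
sim_sky_car = exemplar_similarity(x, sky, car)
```

Here "sky" and "cloud" have visually similar neighbors around x, so their similarity is close to 1, while "sky" and "car" score much lower.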


Relevance between tags (Cont.)



Co-occurrence similarity between tags


f(ti) : the number of images containing tag ti

f(ti,tj) : the number of images containing both tag ti and tag tj

G : the total number of images in Flickr
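The co-occurrence formula is missing from this transcript. The quantities f(ti), f(ti,tj), and G match the ingredients of the Normalized Google Distance, so the sketch below assumes a distance of that style, turned into a similarity with an exponential. Both the formula and the counts are assumptions for illustration.

```python
import math

def cooccurrence_distance(f_i, f_j, f_ij, total):
    """Distance in the style of the Normalized Google Distance (assumed form)."""
    log_i, log_j, log_ij = math.log(f_i), math.log(f_j), math.log(f_ij)
    return (max(log_i, log_j) - log_ij) / (math.log(total) - min(log_i, log_j))

def cooccurrence_similarity(f_i, f_j, f_ij, total):
    """Map the distance to (0, 1]; tags that never co-occur get similarity 0."""
    if f_ij == 0:
        return 0.0
    return math.exp(-cooccurrence_distance(f_i, f_j, f_ij, total))

# Hypothetical counts: each tag appears on 1,000 of 1,000,000 images.
always_together = cooccurrence_similarity(1000, 1000, 1000, 1_000_000)
rarely_together = cooccurrence_similarity(1000, 1000, 10, 1_000_000)
never_together = cooccurrence_similarity(1000, 1000, 0, 1_000_000)
```

Two tags that always appear together have distance 0 and similarity 1, and the similarity falls as the joint count f(ti,tj) shrinks relative to the individual counts.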


Relevance score between two tags


where
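The combined formula and the text that followed "where" are not preserved in this transcript. Since the tag-tag relevance draws on both visual similarity and tag co-occurrence, one natural form is a weighted combination of the two. The symbols below are assumptions: φ_e for the visual similarity, φ_c for the co-occurrence similarity, and λ for the weight.

```latex
s_{ij} = \lambda\, \varphi_c(t_i, t_j) + (1 - \lambda)\, \varphi_e(t_i, t_j),
\qquad \lambda \in [0, 1]
```

With λ = 1 only co-occurrence is used, and with λ = 0 only visual similarity is used.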

Random walk over tag graph

P: n by n transition matrix.

pij : the probability of the transition from node i to j


rk(j) : the relevance score of node j at iteration k

Random walk
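The update rule is not preserved in this transcript. The sketch below assumes a common form of random walk with restart: each score is a mix of the scores propagated from neighboring tags through P and the tag's initial relevance score. The weight `alpha`, the row normalization of the similarity matrix, and all numbers are assumptions.

```python
def transition_matrix(similarity):
    """Row-normalize a tag-tag similarity matrix so that each row sums to 1.

    Assumes every row has a positive sum.
    """
    return [[s / sum(row) for s in row] for row in similarity]

def random_walk(similarity, initial, alpha=0.5, iterations=100):
    """Iterate r_k(j) = alpha * sum_i r_{k-1}(i) * p_ij + (1 - alpha) * v_j."""
    p = transition_matrix(similarity)
    n = len(initial)
    scores = list(initial)
    for _ in range(iterations):
        scores = [
            alpha * sum(scores[i] * p[i][j] for i in range(n))
            + (1.0 - alpha) * initial[j]
            for j in range(n)
        ]
    return scores

# Hypothetical tag graph with three tags and their initial relevance scores.
similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
initial = [0.2, 0.5, 0.3]

final = random_walk(similarity, initial)
ranking = sorted(range(len(final)), key=lambda j: final[j], reverse=True)
```

Because alpha is below 1, the iteration is a contraction and converges to a fixed point regardless of the number of extra iterations. Tags that are strongly connected to other relevant tags gain score relative to isolated ones.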


Random walk over tag graph (Cont.)


Experiments

Dataset: 50,000 images crawled from Flickr

Popular tags:

Raw tags: more than 100,000 unique tags

Filtered tags: 13,330 unique tags


Performance Metric

Normalized Discounted Cumulative Gain (NDCG)


r(i) : the relevance level of the i-th tag

Zn : a normalization constant chosen so that the optimal ranking's NDCG score is 1.
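The NDCG formula is missing from this transcript. A minimal sketch follows, assuming the widely used gain 2^r(i) − 1 and discount log(1 + i). Zn is realized as the reciprocal of the DCG of the ideal ordering, so that the optimal ranking scores 1. The relevance levels in the example are hypothetical.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance levels."""
    return sum(
        (2 ** r - 1) / math.log(1 + i) for i, r in enumerate(relevances, start=1)
    )

def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical relevance levels of the tags of one image, in ranked order.
perfect = ndcg([5, 4, 3, 1])
reversed_order = ndcg([1, 3, 4, 5])
```

The base of the logarithm does not matter here because it cancels between the DCG and its ideal value.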

Experimental Result

Comparison among different tag ranking approaches


Conclusion

Estimate the tag-image relevance by kernel density estimation.

Estimate the tag-tag relevance by visual similarity and tag co-occurrence.

A random-walk-based approach is used to refine the tag ranking.



Thank you!
