
Post on 31-Dec-2015


Tag Ranking

Presented by Jie Xiao

Dept. of Computer Science

Univ. of Texas at San Antonio

[email protected] 2

Outline

Problem

Probabilistic tag relevance estimation

Random walk tag relevance refinement

Experiment

Conclusion


Problem

There are millions of social images on the Internet, which makes them attractive for research.

The tags associated with an image are not ordered by relevance.

Problem (Cont.)


Tag relevance

There are two types of relevance to be considered.

The relevance between a tag and an image

The relevance between two tags for the same image.


Probabilistic Tag Relevance Estimation

Similarity between a tag and an image


x : an image

t : a tag associated with image x

P(t|x) : the probability of tag t given image x

P(t) : the prior probability that tag t occurs in the dataset

After applying Bayes’ rule, we can derive that

Probabilistic Relevance Estimation (Cont.)

Since the goal is to rank the tags of an individual image, and p(x) is identical for all of these tags, we refine it as
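The equations on these slides did not survive in the transcript. A reconstruction consistent with the definitions of P(t|x) and P(t), assuming the tag-image score (written s(t, x) here, a symbol not taken from the slides) is the ratio of the two:

```latex
s(t, x) = \frac{p(t \mid x)}{p(t)}
        = \frac{p(x \mid t)\, p(t)}{p(x)\, p(t)}
        = \frac{p(x \mid t)}{p(x)}
\quad\Longrightarrow\quad
s(t, x) \doteq p(x \mid t)
```

The last step drops p(x), which does not change the order of the tags of a single image.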


Density Estimation

Let (x1, x2, …, xn) be an iid sample drawn from some distribution with an unknown density ƒ.

Two types of methods to describe the density:

Histogram

Kernel density estimator


Histogram


Credit: All of Nonparametric Statistics via UTSA library
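A histogram estimates the density by counting the sample points in each bin and dividing by n times the bin width. A minimal sketch, assuming equal-width bins on a known interval; the function name and its arguments are illustrative and not taken from the slides.

```python
import numpy as np

def histogram_density(sample, num_bins, low, high):
    """Histogram density estimate on [low, high] with equal-width bins.

    Returns (edges, heights). The heights integrate to 1 when every
    sample point lies inside [low, high].
    """
    sample = np.asarray(sample, dtype=float)
    counts, edges = np.histogram(sample, bins=num_bins, range=(low, high))
    width = (high - low) / num_bins
    # Height of a bin = (fraction of points in the bin) / (bin width).
    heights = counts / (sample.size * width)
    return edges, heights

edges, heights = histogram_density([0.1, 0.2, 0.3, 0.7], num_bins=4, low=0.0, high=1.0)
```

The estimate is piecewise constant, so it depends on both the bin width and where the bin edges fall.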

Kernel Density Estimation


A smooth function K is used to estimate the density.

Kernel Density Estimation (Cont.)

Its kernel density estimator is
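The formula on this slide was lost in extraction. The standard one-dimensional estimator with bandwidth h is f_hat(x) = 1/(n h) * sum_i K((x - x_i) / h). A minimal sketch, assuming a Gaussian kernel; the bandwidth h and the function names are illustrative choices, not taken from the slides.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, a common choice for the smooth kernel K."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(x, sample, h):
    """Kernel density estimate at a scalar point x.

    sample : iid observations x_1, ..., x_n
    h      : bandwidth (assumed positive)
    """
    sample = np.asarray(sample, dtype=float)
    u = (x - sample) / h
    return float(gaussian_kernel(u).sum() / (sample.size * h))

density_at_zero = kde(0.0, [-1.0, 0.0, 2.0], h=0.5)
```

Unlike the histogram, the estimate is smooth and does not depend on bin edges; the bandwidth h plays the role of the bin width.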


Probabilistic Relevance Estimation (Cont.)

Kernel Density Estimation (KDE) is adopted to estimate the probability density function p(x|t).


Xi : the set of images containing tag ti

xk : the k-th nearest-neighbor image in image set Xi

K : the density kernel function used to estimate the probability

|Xi| : the cardinality of Xi
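The symbols above suggest an estimate of the form p(x|ti) = (1/|Xi|) * sum over xk in Xi of K(x - xk). A minimal sketch under stated assumptions: images are hypothetical feature vectors, K is an unnormalized Gaussian exp(-||x - xk||^2 / sigma^2), and all images in Xi are used rather than only the nearest neighbors. None of these choices is confirmed by the slides.

```python
import numpy as np

def tag_relevance(x, tag_images, sigma=1.0):
    """KDE-style estimate of p(x | t_i) from the images that carry tag t_i."""
    X_i = np.asarray(tag_images, dtype=float)          # shape (|X_i|, d)
    sq_dists = np.sum((X_i - np.asarray(x, dtype=float)) ** 2, axis=1)
    # Average kernel response over the tag's image set.
    return float(np.mean(np.exp(-sq_dists / sigma**2)))

def rank_tags(x, images_by_tag, sigma=1.0):
    """Order the tags of image x by decreasing estimated relevance."""
    scores = {t: tag_relevance(x, imgs, sigma) for t, imgs in images_by_tag.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy data: image x sits close to the "sky" images and far from the "car" images.
images_by_tag = {"car": [[3.0, 3.0], [4.0, 4.0]], "sky": [[0.0, 0.0], [0.1, 0.0]]}
ranking = rank_tags([0.0, 0.0], images_by_tag)
```

A tag scores high when image x lies in a dense region of the images that share the tag.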

Relevance between tags

ti : tag i associated with image x

tj : tag j associated with image x

Xi : the image set containing tag i

Xj : the image set containing tag j

N : the top N nearest neighbors of image x


Relevance between tags (Cont.)


Relevance between tags (Cont.)

Co-occurrence similarity between tags


f(ti) : the number of images containing tag ti

f(ti,tj) : the number of images containing both tag ti and tag tj

G : the total number of images in Flickr
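The co-occurrence formula itself is missing from the transcript. The quantities f(ti), f(ti,tj) and G are the ingredients of a Google-distance-style measure, so the sketch below assumes that form: d = (max(log f(ti), log f(tj)) - log f(ti,tj)) / (log G - min(log f(ti), log f(tj))), mapped to a similarity by exp(-d). Both the form and the function name are assumptions.

```python
import math

def cooccurrence_similarity(f_i, f_j, f_ij, G):
    """Similarity in (0, 1] from tag counts; 0.0 if the tags never co-occur.

    f_i, f_j : number of images containing each tag (assumed > 0)
    f_ij     : number of images containing both tags
    G        : total number of images (assumed > max(f_i, f_j))
    """
    if f_ij == 0:
        return 0.0  # distance treated as infinite
    log_i, log_j = math.log(f_i), math.log(f_j)
    distance = (max(log_i, log_j) - math.log(f_ij)) / (math.log(G) - min(log_i, log_j))
    return math.exp(-distance)

similarity = cooccurrence_similarity(f_i=100, f_j=200, f_ij=50, G=10_000)
```

Two tags that always appear together get distance 0 and similarity 1; the similarity falls as they co-occur less often.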

Relevance between tags (Cont.)


Relevance between tags (Cont.)

Relevance score between two tags


where
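The combined score is not preserved in the transcript. One common way to merge a visual similarity and a co-occurrence similarity is a convex combination; the sketch below assumes that form. The weight lam and the function name are hypothetical.

```python
def tag_similarity(visual_sim, cooccurrence_sim, lam=0.5):
    """Convex combination of two tag-tag similarities (assumed form).

    lam = 1 uses only the visual similarity, lam = 0 only the co-occurrence.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * visual_sim + (1.0 - lam) * cooccurrence_sim

s_ij = tag_similarity(visual_sim=0.8, cooccurrence_sim=0.4, lam=0.5)
```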

Random walk over tag graph

P: n by n transition matrix.

pij : the probability of the transition from node i to j


rk(j) : the relevance score of node j at iteration k

Random walk
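The update rule is missing from the transcript. A minimal sketch, assuming the common damped form rk(j) = alpha * sum_i rk-1(i) * pij + (1 - alpha) * vj, where vj is the initial relevance score of tag j and pij is the row-normalized tag-tag similarity. The names alpha, v and S are assumptions, not taken from the slides.

```python
import numpy as np

def random_walk_refine(S, v, alpha=0.5, num_iters=100, tol=1e-10):
    """Refine initial tag scores v by a damped random walk over the tag graph.

    S     : n-by-n matrix of tag-tag similarities (rows must not sum to zero)
    v     : initial relevance score of each tag
    alpha : weight of the propagated score versus the initial score
    """
    S = np.asarray(S, dtype=float)
    v = np.asarray(v, dtype=float)
    P = S / S.sum(axis=1, keepdims=True)   # p_ij = s_ij / sum_k s_ik
    r = v.copy()
    for _ in range(num_iters):
        # (P.T @ r)[j] = sum_i r[i] * p_ij
        r_next = alpha * (P.T @ r) + (1.0 - alpha) * v
        if np.max(np.abs(r_next - r)) < tol:
            return r_next
        r = r_next
    return r

S = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.1], [0.1, 0.1, 1.0]]
v = [0.2, 0.6, 0.2]
refined = random_walk_refine(S, v, alpha=0.5)
```

For alpha below 1 the iteration is a contraction, so it converges to the same scores from any starting point.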


Random walk over tag graph (Cont.)


Experiments

Dataset: 50,000 images crawled from Flickr

Popular tags:

Raw tags: more than 100,000 unique tags

Filtered tags: 13,330 unique tags


Performance Metric

Normalized Discounted Cumulative Gain (NDCG)


r(i) : the relevance level of the i-th tag

Zn : a normalization constant chosen so that the optimal ranking’s NDCG score is 1.
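The NDCG formula is missing from the transcript. A minimal sketch, assuming the widely used form NDCG = Zn * sum_i (2^r(i) - 1) / log(1 + i), with Zn taken as the reciprocal of the DCG of the ideal ordering; the gain and discount functions are assumptions.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance levels."""
    return sum((2**r - 1) / math.log(1 + i) for i, r in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG divided by the DCG of the best possible ordering (1 / Z_n)."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

perfect = ndcg([3, 2, 1, 0])    # tags already in the best order
reversed_ = ndcg([0, 1, 2, 3])  # most relevant tag ranked last
```

The base of the logarithm cancels in the ratio, so it does not affect the score.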

Experimental Result

Comparison among different tag ranking approaches


Conclusion

Estimate the tag-image relevance by kernel density estimation.

Estimate the tag-tag relevance by visual similarity and tag co-occurrence.

A random-walk-based approach is used to improve the ranking performance.


Thank you!