© 2013 Columbia University E6885 Network Science Lecture 11: Knowledge Graphs E 6885 Topics in Signal Processing -- Network Science Ching-Yung Lin, Dept. of Electrical Engineering, Columbia University November 25th, 2013


© 2013 Columbia University

E6885 Network Science Lecture 11: Knowledge Graphs

E 6885 Topics in Signal Processing -- Network Science

Ching-Yung Lin, Dept. of Electrical Engineering, Columbia University

November 25th, 2013


Course Structure

Class Date   Lecture   Topics Covered
09/09/13     1         Overview of Network Science
09/16/13     2         Network Representation and Feature Extraction
09/23/13     3         Network Partitioning, Clustering and Visualization
09/30/13     4         Network Analysis Use Case
10/07/13     5         Network Sampling, Estimation, and Modeling
10/14/13     6         Network Topology Inference
10/21/13     7         Network Information Flow
10/28/13     8         Dynamic & Probabilistic Networks and Graph Database
11/11/13     9         Final Project Proposal Presentation
11/18/13     10        Graph Databases II
11/25/13     11        Knowledge Graphs
12/02/13     12        Large-Scale Network Processing System
12/09/13     13        Final Project Presentation – I
12/16/13     14        Final Project Presentation – II


Relational Term-Suggestion

Q. What keywords should I put in the search box to get the information I really want?


Term Suggestion and Query Expansion

Document-based: influenced by test collection characteristics

Log-based: click log is biased in favor of top ranks; query log fails for rare queries; not publicly available

Ontology-based (WordNet, Wikipedia): simple concept links only; limited semantic relatedness; difficult to update

Multi-partite network analytics: extracting human factor; incorporating expertise; network community-based


Document-based

Influenced by test collection characteristics

No consideration of key terms that are highly semantically related but do not frequently co-occur.

Examples: apple juice, apple tree, apple store, apple TV

Kim, M. AND Choi, K. A. 1999. Comparison of collocation-based similarity measures in query expansion. Information Processing and Management 35 (1999), 19-30.
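To make the limitation concrete, here is a minimal sketch of a collocation-style similarity. It assumes the Dice coefficient over document co-occurrence as the measure; the function names `dice_scores` and `suggest` and the toy documents are illustrative and do not come from the lecture or the cited paper.

```python
from collections import Counter
from itertools import combinations


def dice_scores(documents):
    """Return the Dice co-occurrence score for every term pair seen together."""
    term_df = Counter()  # number of documents containing each term
    pair_df = Counter()  # number of documents containing both terms of a pair
    for doc in documents:
        terms = set(doc.lower().split())
        term_df.update(terms)
        pair_df.update(frozenset(p) for p in combinations(sorted(terms), 2))
    # Dice = 2 * df(a, b) / (df(a) + df(b))
    return {
        pair: 2 * n / sum(term_df[t] for t in pair)
        for pair, n in pair_df.items()
    }


def suggest(term, scores, k=3):
    """Rank the terms that co-occur with `term`, highest score first."""
    related = []
    for pair, score in scores.items():
        if term in pair:
            (other,) = pair - {term}
            related.append((other, score))
    return sorted(related, key=lambda item: (-item[1], item[0]))[:k]


# Toy collection (hypothetical data).
docs = [
    "apple juice recipe",
    "apple tree orchard",
    "apple store iphone",
    "apple tv streaming",
    "orchard fruit harvest",
]
scores = dice_scores(docs)
suggestions = suggest("apple", scores, k=10)
```

In this toy collection "fruit" never shares a document with "apple", so it can never be suggested, however related the two terms are semantically.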



Log-based

Cluster queries with similar clicked URLs

Identifying the mapping between queries and clicked URLs

Example: "pet food" and "dog food"

BAEZA-YATES, R., AND TIBERI, A. 2007. Extracting Semantic Relations from Query Logs. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2007), 76-85.
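As a rough illustration of clustering queries by their clicked URLs, the sketch below groups queries whose click sets overlap. The Jaccard measure, the 0.5 threshold, the single-link grouping and the example URLs are all assumptions made for illustration; this is not the method of the cited paper.

```python
def jaccard(a, b):
    """Overlap of two URL sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_queries(clicks, threshold=0.5):
    """Single-link clustering of queries, implemented with union-find."""
    queries = sorted(clicks)
    parent = {q: q for q in queries}

    def find(q):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[q] != q:
            parent[q] = parent[parent[q]]
            q = parent[q]
        return q

    for i, q1 in enumerate(queries):
        for q2 in queries[i + 1:]:
            if jaccard(clicks[q1], clicks[q2]) >= threshold:
                parent[find(q2)] = find(q1)

    clusters = {}
    for q in queries:
        clusters.setdefault(find(q), []).append(q)
    return sorted(clusters.values())


# Hypothetical click log: query -> set of clicked URLs.
clicks = {
    "pet food": {"petco.example/food", "chewy.example/dog", "chewy.example/cat"},
    "dog food": {"petco.example/food", "chewy.example/dog"},
    "apple tv": {"apple.example/tv"},
}
clusters = cluster_queries(clicks)
```

A query with no clicks, or a rare query that appears only once, ends up alone in its own cluster, which is the "failure for rare queries" weakness listed for log-based methods.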



WordNet as Ontology

Manually constructed system based on individual words; benefit will be limited

System is not easily updated

Pedersen, T, Patwardhan, S and Michelizzi, J. "WordNet::Similarity - Measuring the Relatedness of Concepts"  2004 In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI-2004) pp. 1024-1025.


Wikipedia as Ontology

Wikipedia is a web-based free encyclopedia that anyone can edit.

The English Wikipedia edition: 2.4 million articles, 1 billion words.

Wikipedia relies on the power of collective intelligence, through peer-reviewed approaches rather than the authority of individuals: high quality, almost noise free.


Previous Approaches

Treating Wikipedia merely as an online dictionary and using it only as a structured knowledge database

Using associated hyperlinks

MILNE, D., WITTEN, I. H., AND NICHOLS, D. 2007. A Knowledge-Based Search Engine Powered by Wikipedia. In Proceedings of the 16th ACM Conference on Information and Knowledge Management (CIKM 2007), 445-454.



Our Challenge

Term Suggestion and Query Expansion

Document-based: influenced by test collection characteristics

Log-based: click log is biased in favor of top ranks; query log fails for rare queries; not publicly available

Ontology-based (WordNet, Wikipedia): simple concept links only; limited semantic relatedness; difficult to update

Multi-partite network analytics: crawling is resource-intensive; human factor modeling; semantic relatedness is difficult to evaluate


Wikipedia as Ontology


[Figure: system overview with the components Query, Ontology Data Sampling, Contributor Expertise Analysis, Semantic Relatedness Weighting, Relative Importance Ranking, Optimization, Visualization Interface, and Evaluation Interface]


[Figure: graph built layer by layer around a key term. C: contributors; T: terms; L: categories]
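One possible reading of "layer by layer" is a breadth-first expansion from the key term over a graph whose nodes are contributors (C), terms (T) and categories (L). The sketch below assumes that reading; the function name `sample_layers`, the edge list and the depth limit are hypothetical and are not taken from the lecture.

```python
from collections import deque


def sample_layers(edges, key_term, max_layers=2):
    """Breadth-first expansion from key_term; returns node -> layer index."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    layer = {key_term: 0}
    queue = deque([key_term])
    while queue:
        node = queue.popleft()
        if layer[node] == max_layers:
            continue  # do not expand beyond the last requested layer
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in layer:
                layer[neighbor] = layer[node] + 1
                queue.append(neighbor)
    return layer


# Nodes are (kind, name) pairs: "T" term, "C" contributor, "L" category.
edges = [
    (("T", "apple"), ("C", "alice")),
    (("T", "apple"), ("L", "Fruit")),
    (("C", "alice"), ("T", "orchard")),
    (("L", "Fruit"), ("T", "pear")),
    (("T", "pear"), ("C", "bob")),
]
layers = sample_layers(edges, ("T", "apple"), max_layers=2)
```

Limiting the number of layers keeps the sample small, which matters because crawling the full graph is resource-intensive.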


Contributor Expertise factor

Expertise inference

Expertise

Contributor to contributor

Contributor to categories

Term to categories

Term to Term


High Semantic Relatedness Term Suggestion from Our System


Word-completion Term Suggestion


Experiment I

Performance Comparison for Different Relationship Levels, Using BibSonomy Dataset

Method         P@1      P@5      S@5      S@20     MRR
Simple link    0.3736   0.3039   0.6017   0.6231   0.4023
+Contributor   0.6151   0.3917   0.8031   0.8116   0.4125
+Expertise     0.6693   0.4412   0.8297   0.9620   0.5919
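For readers unfamiliar with the column headers, the sketch below computes the three kinds of metric under their usual information-retrieval definitions: P@k as the fraction of relevant items among the top k, S@k as whether any relevant item appears in the top k, and MRR as the mean reciprocal rank of the first relevant item. The slides do not define the metrics, so these definitions are assumed here, and the example rankings are made up.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k suggestions that are relevant."""
    return sum(1 for term in ranked[:k] if term in relevant) / k


def success_at_k(ranked, relevant, k):
    """1.0 if at least one relevant suggestion appears in the top k."""
    return 1.0 if any(term in relevant for term in ranked[:k]) else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant suggestion, or 0.0 if there is none."""
    for rank, term in enumerate(ranked, start=1):
        if term in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs, k_values=(1, 5)):
    """Average each metric over (ranked list, relevant set) pairs."""
    n = len(runs)
    report = {}
    for k in k_values:
        report[f"P@{k}"] = sum(precision_at_k(r, rel, k) for r, rel in runs) / n
        report[f"S@{k}"] = sum(success_at_k(r, rel, k) for r, rel in runs) / n
    report["MRR"] = sum(reciprocal_rank(r, rel) for r, rel in runs) / n
    return report


# Two hypothetical queries with their ranked suggestions and relevant terms.
runs = [
    (["ipod", "juice", "tree", "mac", "pie"], {"ipod", "mac"}),
    (["cat", "kibble", "leash", "bone", "vet"], {"kibble"}),
]
report = evaluate(runs)
```

With these two toy queries, only the first has a relevant term at rank 1, so P@1 is 0.5, while both have a relevant term in the top five, so S@5 is 1.0.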


Experiment II – Accuracy on different categories

Category                     WordNet       Bag of words   Our algorithm
Literature                   62.0% ± 5%    62.7% ± 4%     76.8% ± 6%
Natural science              60.7% ± 4%    65.6% ± 6%     73.3% ± 3%
Sociology                    72.1% ± 5%    62.9% ± 5%     72.5% ± 7%
Business                     60.4% ± 6%    58.5% ± 8%     67.1% ± 7%
Law                          52.2% ± 9%    50.4% ± 8%     66.3% ± 6%
Engineering                  54.0% ± 6%    68.3% ± 5%     66.2% ± 4%
Electrical & Computer Eng.   77.0% ± 4%    68.0% ± 3%     82.3% ± 3%
Life Science                 73.1% ± 6%    70.9% ± 6%     81.4% ± 7%
Agriculture                  72.6% ± 5%    65.1% ± 6%     72.3% ± 5%
Medical                      63.0% ± 8%    65.6% ± 7%     61.6% ± 8%

ODP-based precision evaluation results increase 12.5% on average


Precision Comparison With Paraphrase Detection System

System         Synonyms   Hyponyms   Antonyms   Paraphrase
Zhao et al.    -          -          -          0.7444
Our approach   0.2197     0.3665     0.2313     -

82% of the suggested terms are reported as related, i.e., synonyms (22%), hyponyms (37%) or antonyms (23%)


References

Jyh-Ren Shieh, Ching-Yung Lin, Shun-Xuan Wang, Ja-Ling Wu, “Relational Term-Suggestion Graphs Incorporating Multi-Partite Concept and Expertise Networks,” ACM Transactions on Intelligent Systems and Technology (2012).

Jyh-Ren Shieh, Ching-Yung Lin, Shun-Xuan Wang, Ja-Ling Wu, “Building Multi-Modal Relational Graphs for Multimedia Retrieval,” International Journal of Multimedia Data Engineering and Management (IJMDEM): pp. 19-41 (2011). Best paper award nomination.

Jyh-Ren Shieh, Yung-Huan Hsieh, Yang-Ting Yeh, Tse-Chung Su, Ching-Yung Lin, Ja-Ling Wu, “Building term suggestion relational graphs from collective intelligence,” World Wide Web Conference (WWW 2009) pp. 1091-1092 (2009).

Jyh-Ren Shieh, Yang-Ting Yeh, Chih-Hung Lin, Ching-Yung Lin and Ja-Ling Wu, “Using Semantic Graphs for Image Search,” IEEE International Conference on Multimedia & Expo (ICME 2008), pp. 105-108 (2008).


Part-based Object Detection by Learning Random Attributed Graphs

Ref: DQ Zhang and SF Chang, “Detecting image near-duplicate by stochastic attributed relational graph matching with learning”, ACM MM 2004.


Problem 1: Object Detection and Part Identification

a. Does the input image contain the specified object?

b. Where are the object parts?


Problem 2: Learning Part-based Object Model

Automatically learn the structure and parameters

Minimum supervision: no object location and no part location


Prior Work on Part-based Object Detection

Model categories: model without spatial structure; model with hand-built structure; model with learned structure and part statistics

Prior models:

MRF model [Li '94]

Constellation Model [Burl, Weber, Fergus, Perona; Caltech, Oxford '98-'04]

AdaBoost [Viola & Jones '01]

Pictorial structure [Felzenszwalb & Huttenlocher '98]

Elastic Bunch Graph [Wiskott et al. '97]

This new model: graph-based representation; can handle multi-view object detection


Part-based Representation of Visual Scene

Visual scenes are considered as the composition of parts with certain spatial/attribute relations, modeled as an Attributed Relational Graph (ARG)

IND (image near-duplicate) detection as computing ARG similarity

[Figure: two images, each represented as an Attributed Relational Graph (ARG) whose vertices are parts and whose edges are part relations; the two ARGs are compared by ARG similarity]


ARG based on Interest Point Detection

Region-based representation performed very poorly!

Interest point detector: SUSAN (Smallest Univalue Segment Assimilating Nucleus) corner detector

Local features at vertices: spatial location, color, Gabor filter coefficients

Part relational features at edges: spatial coordinate difference
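The vertex and edge attributes listed on this slide map naturally onto a small graph structure. The sketch below is one way to hold them; it assumes the interest points have already been detected (the SUSAN detector is not implemented here) and that every pair of points is connected, which the slide does not state. The names `ARG` and `build_arg` and the sample feature values are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class ARG:
    """Attributed Relational Graph: attributes on both vertices and edges."""
    vertices: dict = field(default_factory=dict)  # vertex id -> feature tuple
    edges: dict = field(default_factory=dict)     # (i, j) -> relational features


def build_arg(points):
    """Build an ARG from interest points given as dicts with
    'xy' (location), 'color' and 'gabor' (filter coefficients)."""
    graph = ARG()
    for i, p in enumerate(points):
        # Local features at the vertex: location, color, Gabor coefficients.
        graph.vertices[i] = (*p["xy"], *p["color"], *p["gabor"])
    for i, j in combinations(range(len(points)), 2):
        (xi, yi), (xj, yj) = points[i]["xy"], points[j]["xy"]
        # Relational feature at the edge: spatial coordinate difference.
        graph.edges[(i, j)] = (xj - xi, yj - yi)
    return graph


# Made-up interest points with tiny feature vectors.
points = [
    {"xy": (0, 0), "color": (255, 0, 0), "gabor": (0.1, 0.2)},
    {"xy": (3, 4), "color": (0, 255, 0), "gabor": (0.3, 0.1)},
    {"xy": (6, 0), "color": (0, 0, 255), "gabor": (0.0, 0.5)},
]
arg = build_arg(points)
```

Storing coordinate differences on the edges, rather than only absolute positions on the vertices, gives the graph a description of part layout that does not depend on where the object sits in the image.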


Stochastic Framework for ARG Similarity

H: hypothesis. H = 1: Graph t is similar to Graph s. H = 0: Graph t is not similar to Graph s.

ARG similarity is the likelihood or likelihood ratio of the stochastic process that transforms the source ARG to the target ARG

[Figure: stochastic process that transforms ARG s (attributes $Y^s$) to ARG t (attributes $Y^t$) through vertex correspondence and attribute transformation]
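The likelihood-ratio form of the similarity can be written out as follows. This is a sketch of one consistent notation, not a formula taken from the slide: the name $S$ is introduced here for illustration, and $Y^s$ and $Y^t$ are assumed to denote the attributes of the source and target ARGs.

```latex
S(G^s, G^t) \;=\; \frac{p\left(Y^t \mid Y^s,\; H = 1\right)}{p\left(Y^t \mid Y^s,\; H = 0\right)}
```

A ratio well above 1 means the target attributes are better explained by a transformation of the source graph than by the "not similar" hypothesis.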


Non-linear Scene Transformation

Scene changes: object movement, occlusion, etc.

Camera changes: viewpoint change, panning, etc.

Photometric changes: lighting, etc.

Digitization changes: resolution, gray scale, etc.

Vertex correspondence models: occlusion of objects, addition of objects

Attribute transformation models: object appearance change, object movement, photometric change

[Figure: Graph s (attributes $Y^s$) transformed into Graph t (attributes $Y^t$)]


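Generative Model of the Stochastic Transformation Process

H: hypothesis. H = 1: the two graphs are similar

X: correspondence matrix, $X = \{x_{11}, x_{12}, \ldots, x_{32}\}$

[Figure: Graph s (vertices 1, 2, 3), Graph t (vertices 1, 2), and their product graph, whose nodes are the correspondence variables such as $x_{11}$ and $x_{32}$]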


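Transformation Likelihood

Prior MRF for constraints

Conditional density for attribute transformation

[Figure: Graph s (vertices 1, 2, 3) and Graph t (vertices 1, 2) with correspondence variables $x_{11}$ and $x_{32}$, shown twice to illustrate the pairwise constraints]

$\psi_{11,12}(x_{11}, x_{12}) = 0$

$\psi_{11,22}(x_{11}, x_{22}) = 1$

Transformation likelihood is: [equation shown as an image on the slide]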
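The two constraint values on this slide can be read as a one-to-one matching rule: two correspondence variables that share a source vertex (or a target vertex) may not both be switched on. The sketch below assumes that reading and assumes the printed values refer to the case where both variables equal 1; the function names are hypothetical, and the attribute-transformation density is left out entirely.

```python
from itertools import combinations


def pair_potential(a, b, x_a, x_b):
    """Pairwise potential between correspondence variables a=(i, u), b=(j, v).

    Returns 0 when both variables are on and they share a source vertex
    or a target vertex (a one-to-one violation), otherwise 1.
    """
    (i, u), (j, v) = a, b
    conflict = (i == j) or (u == v)
    return 0 if (conflict and x_a == 1 and x_b == 1) else 1


def prior_weight(x):
    """Product of all pairwise potentials for an assignment x: (i, u) -> 0/1."""
    weight = 1
    for a, b in combinations(sorted(x), 2):
        weight *= pair_potential(a, b, x[a], x[b])
    return weight


def empty_assignment(n_source, n_target):
    """All correspondence variables switched off (vertices numbered from 1)."""
    return {(i, u): 0
            for i in range(1, n_source + 1)
            for u in range(1, n_target + 1)}


# Graph s has 3 vertices and Graph t has 2, as in the slide's figure.
valid = empty_assignment(3, 2)
valid[(1, 1)] = 1
valid[(2, 2)] = 1      # x11 and x22 do not conflict

invalid = empty_assignment(3, 2)
invalid[(1, 1)] = 1
invalid[(1, 2)] = 1    # x11 and x12 both claim source vertex 1
```

Any assignment that violates the constraint gets prior weight 0, so it contributes nothing when the likelihood sums over correspondences.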


Learning to Match ARGs

Feature point level learning: label every feature point pair (vertex-level annotation)

Image level learning: label duplicate pairs (positive samples) and non-duplicate pairs (negative samples); use variational Expectation-Maximization (E-M)


Experiments and Results

Data set

Images are picked from TREC-VID 2003 video frames (partly based on TDT2 topic detection ground truth)

150 duplicate pairs, 300 non-duplicate images

Learning

Training set: 30 duplicate pairs, 60 non-duplicate images

Feature point level learning: 5 duplicate pairs, 10 non-duplicate images

Image level learning: 25 duplicate pairs, 50 non-duplicate images

[Figure: learning flow from initial parameters, through feature point level learning and image-level learning, to final parameters]


Compare with other similarity measures

(CH) HSV color histogram

(LED) Local Edge Descriptor

(AFDIP) Average feature distance of interest points

(GRAPH) ARG matching with learning

(GRAPH-M) ARG matching with manual parameter adjustment

[Figure: precision versus recall]


Summary of Part-based ARG Visual Modeling Algorithm

● Statistical part-based similarity measure performs much better than global color histogram and grid-based edge map

● Learning-based ARG matching not only saves human cost but may also give better performance


Questions?