semantic labeling a domain-independent...


Page 1 (usc-isi-i2.github.io/slides/pham16-iswc-slides.pdf)

Previous approach: domain-dependent | Our approach: domain-independent | Similarity features | Evaluation | Conclusion and Future Work

Semantic labeling: A domain-independent approach

Minh Pham, Suresh Alse, Craig Knoblock, Pedro Szekely

Information Sciences Institute, University of Southern California

October 21, 2016

Page 2

How can we integrate data?

SEMANTIC LABELING

Page 4

What is Semantic Labeling?

Labeled source (class: Player)

full name       position     height
Hazard, Eden    Midfielder   172
Cahill, Gary    Defender     191
Rooney, Wayne   Forward      176

Semantic types: birthName, position, height

Unlabeled source (class: ?)

fn                 ht       ps
Alan PULIDO        176 cm   FW
Robin VAN PERSIE   186 cm   MF
Miiko ALBORNOZ     180 cm   DF

Semantic types: ?, ?, ?

Page 5

Outline

1 Previous approach: domain-dependent

2 Our approach: domain-independent

3 Similarity features

4 Evaluation

5 Conclusion and Future Work

Page 7

Domain-dependent approach: Training

Labeled source (class: Player; semantic types: birthName, height)

player          height
Hazard, Eden    172
Cahill, Gary    191
Rooney, Wayne   176

Features extracted per attribute, with the semantic type as the class label:

label              all caps   # of chars
player-birthName   True       8
player-height      False      3

Classifier weights per semantic type:

label              w_allcaps   w_#ofchars
player-birthName   7           6
player-height      -7          10
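The training step on this slide can be sketched as follows. This is a minimal illustration, not the previous system's actual feature set: the function name `column_features`, the reading of "all caps" as "every letter is upper case", and the use of the average value length for "# of chars" are all assumptions. The numbers on the slide are illustrative, so the sketch does not try to reproduce them.

```python
def column_features(values):
    """Summarize one attribute (column) with the two toy features from the slide.

    Assumptions: 'all caps' means every alphabetic character is upper case,
    and '# of chars' is the average value length, rounded.
    """
    letters = [ch for v in values for ch in v if ch.isalpha()]
    all_caps = bool(letters) and all(ch.isupper() for ch in letters)
    avg_chars = round(sum(len(v) for v in values) / len(values))
    return {"all_caps": all_caps, "num_chars": avg_chars}


# One feature vector per labeled attribute; the semantic type itself is the
# class label, so every new semantic type adds a new class to the classifier.
labeled = {
    "player-birthName": ["Hazard, Eden", "Cahill, Gary", "Rooney, Wayne"],
    "player-height": ["172", "191", "176"],
}
training_set = [(column_features(vals), label) for label, vals in labeled.items()]
```

Because the class labels are the semantic types of one domain, a classifier trained this way cannot be reused for another domain without retraining.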

Page 8

Domain-dependent approach: Predicting

Unlabeled source (class: ?; semantic type: ?)

player
Cristiano Ronaldo
Sergio Ramos
Lionel Messi

Extracted features: all caps = True, # of chars = 10

Classifier output:

label              P
player-height      0.1
player-birthName   0.8

Page 9

Domain-dependent approach: Adding new attribute

Labeled source (class: Player; semantic types: birthName, height, position)

full name       height   position
Hazard, Eden    172      MF
Cahill, Gary    191      DF
Rooney, Wayne   176      FW

Features extracted per attribute:

label              all caps   # of chars
player-birthName   True       10
player-height      False      3
player-position    True       2

Classifier weights (old -> new where the slide shows both values):

label              w_allcaps   w_#ofchars
player-position    3           4
player-height      -7 -> -10   10 -> 5
player-birthName   7 -> 3      6 -> 8

Page 10

Domain-dependent approach: Adding new attribute

Labeled source (class: Player; semantic types: birthName, height, position)

full name       height   position
Hazard, Eden    172      MF
Cahill, Gary    191      DF
Rooney, Wayne   176      FW

Features extracted per attribute:

label              all caps   # of chars
player-birthName   True       10
player-height      False      3
player-position    True       2

Classifier weights:

label              w_allcaps   w_#ofchars
player-position    3           4
player-height      -10         5
player-birthName   3           8

Page 12

Requirements

Domain-independent learning models

Efficient and scalable framework

Needs only a small amount of domain data as labeled data sources

Page 13

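Our approach: Training

Source 1 (class: Player; semantic types: birthName, height)

player          height
Hazard, Eden    172
Cahill, Gary    191
Rooney, Wayne   176

Source 2 (class: Player; semantic type: birthName)

name
Wayne Smith
Tim Cahill
Juan Mata

Similarity features for each pair of attributes, labeled True when both attributes have the same semantic type:

attribute pair   jaccard   cosine   label
player, name     0.4       0.5      True
height, name     0.05      0.1      False

Classifier weights: w_jaccard = 5.5, w_cosine = 7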
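How the training examples on this slide might be assembled can be sketched as follows. The helper names (`jaccard`, `tokens`, `pair_examples`) and the word-level tokenization are assumptions for illustration, and only the Jaccard feature is computed to keep the sketch short.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def tokens(values):
    """Lower-cased word tokens of all values in a column."""
    return [tok.strip(",").lower() for v in values for tok in v.split()]


def pair_examples(source1, source2):
    """One training example per attribute pair across two labeled sources.

    Each source maps attribute name -> (semantic type, values). The label
    only says whether the two semantic types match; it never names the type.
    """
    examples = []
    for (n1, (t1, v1)), (n2, (t2, v2)) in product(source1.items(), source2.items()):
        features = {"jaccard": jaccard(tokens(v1), tokens(v2))}
        examples.append(((n1, n2), features, t1 == t2))
    return examples


source1 = {
    "player": ("birthName", ["Hazard, Eden", "Cahill, Gary", "Rooney, Wayne"]),
    "height": ("height", ["172", "191", "176"]),
}
source2 = {"name": ("birthName", ["Wayne Smith", "Tim Cahill", "Juan Mata"])}
examples = pair_examples(source1, source2)
```

Since the label is a plain True/False, the same binary classifier can in principle be applied to attribute pairs from any domain.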

Page 14

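Our approach: Predicting

Labeled source (class: Player; semantic types: birthName, height)

player          height
Hazard, Eden    172
Cahill, Gary    191
Rooney, Wayne   176

Unlabeled source (class: ?; semantic type: ?)

?
Wayne Smith
Tim Cahill
Juan Mata

Similarity features and classifier output for each pair of attributes:

attribute pair   jaccard   cosine   P_True
player, ?        0.4       0.5      0.8
height, ?        0.05      0.1      0.1

Predicted semantic type: player-birthName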
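The prediction step can be sketched as a ranking of candidate semantic types by the probability that the pair is a match. The weights are the ones shown on the training slide; the bias value and the function names are made up for this sketch, so the resulting probabilities are not meant to reproduce the slide exactly.

```python
import math


def p_true(features, weights, bias):
    """Logistic-regression probability that a pair shares a semantic type."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_semantic_types(pair_features, weights, bias):
    """Rank candidate semantic types for one unlabeled attribute.

    pair_features maps a semantic type to the similarity features between
    the unlabeled attribute and a labeled attribute of that type.
    """
    scores = {t: p_true(f, weights, bias) for t, f in pair_features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Weights from the training slide; the bias is a hypothetical value.
weights = {"jaccard": 5.5, "cosine": 7.0}
bias = -4.0
pair_features = {
    "player-height": {"jaccard": 0.05, "cosine": 0.1},
    "player-birthName": {"jaccard": 0.4, "cosine": 0.5},
}
ranking = rank_semantic_types(pair_features, weights, bias)
```

The first entry of `ranking` is the predicted semantic type; keeping the whole ranked list is what makes a rank-based measure such as MRR applicable.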

Page 15

Our approach: Adding new attribute

New labeled attribute (class: Player; semantic type: position)

position
Midfielder
Defender
Forward

The new attribute is stored in the data storage.
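One way to read this slide is that a new semantic type only adds labeled data, while the pairwise classifier stays as it is. The class below is a hypothetical in-memory stand-in for the "Data Storage" box, written under that reading; its name and methods are not taken from the authors' system.

```python
class LabeledAttributeStore:
    """In-memory stand-in for the 'Data Storage' box on the slide."""

    def __init__(self):
        self._attributes = []  # (semantic type, values) pairs

    def add(self, semantic_type, values):
        """Adding a labeled attribute only appends data; nothing is retrained."""
        self._attributes.append((semantic_type, list(values)))

    def candidates(self):
        """All labeled attributes an unlabeled attribute is compared against."""
        return list(self._attributes)

    def semantic_types(self):
        """Distinct semantic types currently available as labels."""
        return sorted({t for t, _ in self._attributes})


store = LabeledAttributeStore()
store.add("player-birthName", ["Hazard, Eden", "Cahill, Gary", "Rooney, Wayne"])
store.add("player-position", ["Midfielder", "Defender", "Forward"])
```

At prediction time, each stored attribute would be paired with the unlabeled attribute and scored by the already trained classifier.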

Page 16

Classification models

Requirement: models that output class probabilities, which are used as ranking scores.

Typical methods: Logistic Regression, Random Forest

Logistic Regression over Random Forest:

Easier to interpret

Faster to train

Better class probabilities for ranking
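To make the choice of logistic regression concrete, here is a minimal sketch of a binary logistic regression trained by batch gradient descent on similarity features. It is a from-scratch illustration under assumed hyperparameters (`lr`, `epochs`) and toy data, not the authors' training code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_proba(w, b, x):
    """Probability of the positive class, usable directly as a ranking score."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Toy (jaccard, cosine) pairs: 1 = same semantic type, 0 = different.
X = [[0.4, 0.5], [0.6, 0.7], [0.05, 0.1], [0.0, 0.2]]
y = [1, 1, 0, 0]
w, b = train_logistic_regression(X, y)
```

The learned weights can be read directly as the importance of each similarity feature, which is the interpretability point made on the slide.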

Page 18

Similarity features

Attribute name similarity: Jaccard

Value similarity: Jaccard, TF-IDF

Distribution similarity: Kolmogorov-Smirnov

Histogram similarity: Mann-Whitney

Page 19

Attribute name similarity

first name   FirstName   Address
...          ...         ...
...          ...         ...
...          ...         ...

Similar: "first name", "FirstName"

Not similar: "Address"

Similarity measure: Jaccard similarity
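A minimal sketch of Jaccard similarity on attribute names follows. The tokenization is an assumption: names are split on camelCase boundaries and non-alphanumeric characters so that "first name" and "FirstName" produce the same tokens. The slides do not say how names are tokenized.

```python
import re


def name_tokens(name):
    """Split an attribute name into lower-cased word tokens.

    Assumption: names are split on camelCase boundaries and on any
    non-alphanumeric character.
    """
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return {tok.lower() for tok in re.split(r"[^A-Za-z0-9]+", spaced) if tok}


def name_jaccard(name_a, name_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| over the two token sets."""
    a, b = name_tokens(name_a), name_tokens(name_b)
    return len(a & b) / len(a | b) if a | b else 0.0
```

With this tokenization, `name_jaccard("first name", "FirstName")` is 1.0 and `name_jaccard("FirstName", "Address")` is 0.0, matching the similar and not-similar cases on the slide.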

Page 21

Value similarity

Player name    Name         Club name
Gary Cahill    Juan Quin    Chelsea
Metsul Ozeil   De Gea       Real Madrid
Juan Mata      Tim Cahill   Barcelona

Similar: "Player name", "Name"

Not similar: "Club name"

Similarity measures: Jaccard similarity, TF-IDF cosine similarity
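The TF-IDF cosine measure can be sketched by treating each column as one document. The smoothed IDF formula and the whitespace tokenization below are assumptions; the slides only name the measure.

```python
import math
from collections import Counter


def tfidf_vectors(columns):
    """TF-IDF vector per column, treating each column as one document.

    Assumption: idf = log((1 + N) / (1 + df)) + 1, a common smoothed variant.
    """
    docs = {name: Counter(tok.lower() for v in vals for tok in v.split())
            for name, vals in columns.items()}
    n_docs = len(docs)
    df = Counter(tok for counts in docs.values() for tok in counts)
    return {
        name: {tok: tf * (math.log((1 + n_docs) / (1 + df[tok])) + 1)
               for tok, tf in counts.items()}
        for name, counts in docs.items()
    }


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


columns = {
    "Player name": ["Gary Cahill", "Metsul Ozeil", "Juan Mata"],
    "Name": ["Juan Quin", "De Gea", "Tim Cahill"],
    "Club name": ["Chelsea", "Real Madrid", "Barcelona"],
}
vectors = tfidf_vectors(columns)
```

"Player name" and "Name" share the tokens "cahill" and "juan", so their cosine is positive, while "Player name" and "Club name" share no tokens and score 0.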

Page 23

Jaccard similarity for numeric values

Numeric Jaccard similarity

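Given two sets of numeric values $A$ and $B$ that range over $[a_s, a_e]$ and $[b_s, b_e]$:

$$\mathrm{numJaccardSim}(A, B) = \frac{|[a_s, a_e] \cap [b_s, b_e]|}{|[a_s, a_e] \cup [b_s, b_e]|}$$

[Figure: set 1 and set 2 drawn as ranges on an axis from 0 to 100, with their overlap marked]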
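The formula above can be sketched in a few lines. The function name `num_jaccard_sim` and the choice to measure each interval by its length are assumptions made for this illustration.

```python
def num_jaccard_sim(a, b):
    """Overlap of the value ranges of two numeric columns.

    The ranges are [min(a), max(a)] and [min(b), max(b)]. Assumption:
    |.| is interval length, and the union is the sum of both lengths
    minus the overlap.
    """
    a_s, a_e = min(a), max(a)
    b_s, b_e = min(b), max(b)
    intersection = max(0.0, min(a_e, b_e) - max(a_s, b_s))
    union = (a_e - a_s) + (b_e - b_s) - intersection
    # Both ranges are single points when union is 0: compare the points.
    return intersection / union if union > 0 else float(a_s == b_s)
```

For example, ranges [0, 60] and [40, 100] overlap on a length of 20 out of a union of 100, so `num_jaccard_sim([0, 60], [40, 100])` gives 0.2.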

Page 24

Distribution similarity

# game played   # goal scored
4               3
...             ...
18              11
23              22

Similar / Not similar

Similarity measure: Kolmogorov-Smirnov (KS) test
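The statistic behind the two-sample KS test can be sketched without any library. How the test result is turned into a feature value (for example, the p-value or one minus the statistic) is not stated on the slides, so the sketch stops at the statistic itself.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the two empirical CDFs:
    0.0 for identical samples, 1.0 for fully separated ones.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap
```

A small statistic means the two numeric columns are distributed alike, which is the signal this feature is meant to capture.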

Page 28

Histogram similarity

position   ps
1          GK
4          MF
2          DF
4          FW

Not similar / Similar

Similarity measure: Mann-Whitney (MW) test
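The Mann-Whitney U statistic can be sketched as a pair count over two numeric samples. How a column is reduced to such a sample (for instance, to the frequencies of its distinct values) is not spelled out on the slides and is left open here.

```python
def mann_whitney_u(sample_a, sample_b):
    """Mann-Whitney U statistic for sample_a, counting ties as one half.

    U is the number of (a, b) pairs with a > b. Dividing by
    len(sample_a) * len(sample_b) gives a value in [0, 1] that equals
    0.5 when neither sample tends to be larger than the other.
    """
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u
```

This quadratic loop is fine for a sketch; a rank-based computation would be the usual choice for long columns.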

Page 33

Evaluation

Data sets:

Domain data   # sources   # semantic types   # attributes
soccer        12          14                 97
museum        29          20                 217
city          10          52                 520
weather       4           11                 44
T2D Gold      1748        7983               ?

Measurement: Mean Reciprocal Rank (MRR)

Evaluated systems: DSL (our approach), SemanticTyper (Ramnandan et al., 2015), T2K (Ritze et al., 2015)
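Mean Reciprocal Rank can be computed as below. The function name and the convention that a missing correct type contributes 0 are assumptions of this sketch.

```python
def mean_reciprocal_rank(rankings, truths):
    """MRR over attributes: average of 1 / rank of the correct semantic type.

    An attribute whose correct type is missing from its ranking contributes 0.
    """
    total = 0.0
    for ranked, truth in zip(rankings, truths):
        if truth in ranked:
            total += 1.0 / (ranked.index(truth) + 1)
    return total / len(truths)
```

For three attributes whose correct types are ranked first, second, and not at all, the reciprocal ranks are 1, 1/2, and 0, giving an MRR of 0.5.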

Page 34

Performance of DSL vs SemanticTyper

[Bar chart: average MRR (y-axis, 0.4 to 1) on the soccer, museum, and city data sets; series: DSL, SemanticTyper]

Page 35

Performance of DSL (trained on different datasets)

[Bar chart: average MRR (y-axis, 0.4 to 1) on the soccer, museum, city, and weather data sets; series: Train on soccer, Train on museum, Train on city]

Page 36

Performance of DSL vs T2K on T2D Gold dataset

Experimental settings:

Labeled sources: DBpedia data in table format

DSL’s classifiers: trained on soccer, museum and city datasets

T2K results: training and testing

[Bar chart: average MRR (y-axis, 0.4 to 1) on T2D Gold; series: DSL, T2K (opt), T2K (eval)]

Page 38

Conclusion and Future Work

Conclusion:

Domain-independent approach

Scalable framework

Future Work:

Adjust classifier based on domain characteristics

Detect unseen semantic types in labeling phase
