The Linked Data Snowball and Why We Need Reconciliation

December 1st, 2014

The Andrew W. Mellon Foundation Workshop on Reconciliation of Linked Open Data

Rob Sanderson / [email protected] / @azaroth42

web.stanford.edu/~azaroth/#me

[email protected] / +azaroth42

orcid: 0000-0003-4441-6852

http://www.informatik.uni-trier.de/~ley/pers/hd/s/Sanderson:Robert

http://academic.research.microsoft.com/Author/2765999

http://www.scopus.com/authid/detail.url?authorId=8988953600

www.researchgate.net/profile/Rob_Sanderson

facebook.com/rob.sanderson / linkedin.com/pub/robert-sanderson/1/172/5a6/

[email protected] / [email protected]

public.lanl.gov/rsanderson / gondolin.hist.liv.ac.uk/~azaroth

[email protected] / [email protected]

A Brief Survey of Linked Open Data

http://lod-cloud.net/ as of Aug 2014

Some Highlights

Libraries:

BNF, DNB, BL, LoC, KB, ...

Archives:

SNAC, LOCAH, Medici Archives, ...

Museums:

BM, YCBA, vu.nl, Smithsonian, Getty, AAC, ...

Consortia:

Europeana (+), TEL, RLUK, DPLA, ...

Government:

data.gov, data.gov.uk, legislation.gov.uk, ...

Companies:

OCLC, Google, IBM, New York Times, ...

Lots of Adoption = Lots of URIs

For the Same Thing :(

Why So Many?

Do I know the URI, or can I find it? No -> URI

Understand and agree with the model used? No -> URI

Understand and agree with the description? No -> URI

Agree the URI identifies the same entity? No -> URI

Agree description is complete? No -> URI

Yes -> Hooray, you reused a URI! Now start again with the next one :(
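On one reading of the flowchart, a publisher reuses an existing URI only when every question is answered "Yes"; a single "No" ends in a newly minted URI. The sketch below encodes that reading in Python. The class, field, and function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReuseChecks:
    """Answers to the five questions for one candidate URI (hypothetical names)."""
    can_find_uri: bool
    agree_with_model: bool
    agree_with_description: bool
    identifies_same_entity: bool
    description_complete: bool


def decide(checks: ReuseChecks) -> str:
    """Return "reuse" only if every answer is yes; any "No" means "mint"."""
    answers = [
        checks.can_find_uri,
        checks.agree_with_model,
        checks.agree_with_description,
        checks.identifies_same_entity,
        checks.description_complete,
    ]
    return "reuse" if all(answers) else "mint"


all_yes = ReuseChecks(True, True, True, True, True)
one_no = ReuseChecks(True, True, True, True, False)
print(decide(all_yes))
print(decide(one_no))
```

Because the checks are joined with a logical AND, each extra question can only lower the chance of reuse, which is one way to see why so many URIs get minted.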

Many Special and Unique Snowflakes

Become a Huge Technical Debt Snowball

Option 1: Balance the Equation

Cost(Create URI) + Cost(Maintain URI)

<=

Cost(Find Good URI) + Cost(Understand Model) + Cost(Understand Content)
+ Cost(Network Latency)
+ min( Risk(Reliability), Cost(Cache Content) )
- Value(Linking Graph)
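Assuming the inequality places the cost of minting a URI on the left and the net cost of reusing someone else's on the right, it can be evaluated directly. The Python sketch below does that with made-up numbers. Every value and name is a hypothetical placeholder, and all terms are assumed to share one unit.

```python
def cost_to_mint(create: float, maintain: float) -> float:
    """Left side: cost of creating and maintaining your own URI."""
    return create + maintain


def cost_to_reuse(find: float, understand_model: float, understand_content: float,
                  latency: float, reliability_risk: float, cache_cost: float,
                  linking_value: float) -> float:
    """Right side: cost of reusing a URI, net of the value of the linked graph."""
    return (find + understand_model + understand_content + latency
            + min(reliability_risk, cache_cost) - linking_value)


mint = cost_to_mint(create=1.0, maintain=2.0)

# Hypothetical "today": reuse is costlier, so minting wins.
reuse_today = cost_to_reuse(find=3.0, understand_model=2.0, understand_content=2.0,
                            latency=0.5, reliability_risk=1.5, cache_cost=1.0,
                            linking_value=4.0)

# Hypothetical "balanced": a more valuable linking graph tips the choice to reuse.
reuse_balanced = cost_to_reuse(find=3.0, understand_model=2.0, understand_content=2.0,
                               latency=0.5, reliability_risk=1.5, cache_cost=1.0,
                               linking_value=6.0)

print(mint <= reuse_today)     # minting is no more expensive than reusing
print(mint <= reuse_balanced)  # reuse is now the cheaper option
```

Balancing the equation means lowering the terms on the right or raising Value(Linking Graph) until the comparison stops favoring minting.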

Option 1 Likelihood

Option 2: Reconciliation of URIs

Stanford's URIs British Library's URIs

Option 2: Reconciliation of URIs

[Venn diagram: "Stanford's Entities" and "British Library's Entities", overlapping in "Shared Entities without Shared URIs"]

Option 2: Reconciliation Process

Discover this intersection given the descriptions of the entities
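A deliberately naive sketch of what discovering that intersection could look like: two entities match when their normalized labels and one other descriptive field agree. The URIs, the record layout, and the matching rule are hypothetical. Real reconciliation would need fuzzier comparison and, most likely, human review.

```python
import unicodedata


def normalize(label: str) -> str:
    """Lowercase, strip accents and punctuation, and sort name parts."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(sorted(cleaned.lower().split()))


def reconcile(source_a: dict, source_b: dict) -> list:
    """Return (uri_a, uri_b) pairs whose normalized label and birth year agree."""
    index = {}
    for uri, desc in source_b.items():
        key = (normalize(desc["label"]), desc.get("birth_year"))
        index.setdefault(key, []).append(uri)
    matches = []
    for uri, desc in source_a.items():
        key = (normalize(desc["label"]), desc.get("birth_year"))
        for other in index.get(key, []):
            matches.append((uri, other))
    return matches


# Hypothetical descriptions published under two institutions' own URIs.
stanford = {
    "http://example.org/stanford/p1": {"label": "Brontë, Charlotte", "birth_year": 1816},
    "http://example.org/stanford/p2": {"label": "Austen, Jane", "birth_year": 1775},
}
british_library = {
    "http://example.org/bl/a9": {"label": "Charlotte Bronte", "birth_year": 1816},
    "http://example.org/bl/b3": {"label": "Emily Brontë", "birth_year": 1818},
}

matches = reconcile(stanford, british_library)
print(matches)
```

Each resulting pair is a candidate "same entity" statement that could be published as a link between the two URIs once it has been checked.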

Option 2: Reconciliation Process

Best sort of engineering problem:

• Easy to explain

• Helps many organizations at once

• Provides significant value and utility

• Difficult to solve

But:

• Requires community adoption of the results

Current Community

Expectation Management is Important

Or at best:

Top Three Questions to Answer (according to Rob)

Which sorts of entities should this community reconcile?

How can we share the engineering internationally?

How do we ensure future usage of the reconciled entities?

Thoughts: Entities to Reconcile

Start with least controversial and most unique

• Unique physical objects

• People

• Places

Must generate consensus around identity within the LAM community.

Must focus on unique selling points: how can we be more useful than DBpedia for our own entities?

Thoughts: Shared Engineering

Let a thousand snowflakes fall ...

... then build the best snowball possible.

• Solve small, manageable problems well

• Interoperability between platforms: plug and play

• Communicate continuously

Focused projects that fit into a whole, leveraging the experts in the appropriate domain.

Requires some degree of community structure and management to ensure we're building off each other.

Thoughts: Ensure Usage

Build consensus early and often

• Between institutions

• Within the LAM community

• Outside the LAM community

Who should be here that isn't? Lots of libraries are represented; we also need input from museums and archives, as they have more unique entities.

We need to ensure that LAM institutions use the reconciled entities, which involves starting to balance the cost equation.

Thank You!
