linked (data) scientometrics keynote


Post on 23-Jan-2018

TRANSCRIPT

Page 1: Linked (Data) Scientometrics Keynote

Linked Science Semantic Web Value Proposition Scientometrics

Linked (Data) Scientometrics

Linked Science 2015 Keynote

Krzysztof Janowicz
STKO Lab, University of California, Santa Barbara, USA

Linked Data Scientometrics K. Janowicz

Page 2: Linked (Data) Scientometrics Keynote

What is Linked Science?

Scientific dissemination traditionally relies heavily on scholarly articles and presentations at conferences. However in the past few years, we have seen an increasing trend towards the publication of raw research data to facilitate verification and reuse. Linked Science champions the process of publishing, sharing and interlinking scientific resources and data along with complete experiment context, which is critical for understanding, reusing and verifying scientific research. Semantic Web technologies provide a promising means for achieving this practice.

(From the Linked Science 2015 call)

What are the research questions of Linked Science, what are the bottlenecks?

Page 3: Linked (Data) Scientometrics Keynote

What are Scientometrics?

The field of scientometrics is concerned with measuring and analyzing the impact of science in its broadest sense.

(Raw) data by example:
Publications
Authors
Affiliations
Keywords
Themes
Funding sources
Citations
...

What is meant by measuring and analyzing?

Page 4: Linked (Data) Scientometrics Keynote

What are Scientometrics?

Scientometrics Research Questions

Research questions by example

Simple and boring:
Number of papers at ISWC 2015

Boring:
Number of papers by a specific W. Zhang in 2015

Simple and interesting:
What goes here?

Interesting:
Is the Semantic Web as a research area growing or shrinking?
Are Linked Data and Semantic Web the same community?
Are the research interests of a researcher changing?
What are the new research trends in Artificial Intelligence?
To which university should I go to study geo-semantics?
Who are good reviewers for a certain paper?

Page 5: Linked (Data) Scientometrics Keynote

What are Scientometrics?

Scientometrics Research Questions

Research questions by example

Simple and boring:
Number of papers at ISWC 2015

Boring:
Number of papers by a specific W. Zhang in 2015

Simple and interesting:

Interesting:
Is the Semantic Web as a research area growing or shrinking?
Are Linked Data and Semantic Web the same community?
Are the research interests of a researcher changing?
What are the new research trends in Artificial Intelligence?
To which university should I go to study geo-semantics?
Who are good reviewers for a certain paper?

Page 6: Linked (Data) Scientometrics Keynote

What are Scientometrics?

Scientometrics Research Questions

Research questions by example

Simple and boring:
Number of papers at ISWC 2015

Boring:
Number of papers by a specific W. Zhang in 2015

Should be simple and interesting:
How does a change in affiliations impact a researcher’s interests?
Is there a relation between spatial proximity and citations?

Interesting:
Is the Semantic Web as a research area growing or shrinking?
Are Linked Data and Semantic Web the same community?
Are the research interests of a researcher changing?
What are the new research trends in Artificial Intelligence?
To which university should I go to study geo-semantics?
Who are good reviewers for a certain paper?

Page 7: Linked (Data) Scientometrics Keynote

What are Scientometrics?

Why are interesting scientometrics questions not simple?

Page 8: Linked (Data) Scientometrics Keynote

Retrieval

Key Limitations: Data Retrieval

Even the major data hubs such as Data.gov still rely on keyword-based search and have unreliable, incomplete, and missing metadata. For this type of retrieval problem, even ‘a little semantics goes a long way’ (Hendler 1997).

Page 9: Linked (Data) Scientometrics Keynote

Sensemaking

Key Limitations: Sensemaking and Fitness for Purpose

There is no shortage of data, but finding data that is fit for a certain purpose is difficult.
Data as statements, not as truth; e.g., according to Springer I am at WSU, not UCSB.
Heterogeneity is caused by cultural differences, progress in science, viewpoints, ...; e.g., associate professor versus senior lecturer.
Lack of provenance information.
Sensemaking requires more powerful semantic technologies and ontologies (compared to IR).
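
One way to read "data as statements, not as truth" is to store every claim together with the source that made it, so conflicting values can sit side by side instead of overwriting each other. The following is a minimal sketch under that assumption. The `Statement` class and `claims_by_value` helper are hypothetical names, and the second source is invented for illustration; only the Springer/WSU claim comes from the slide.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One claim made by one source; it is not assumed to be true."""
    subject: str
    predicate: str
    value: str
    source: str  # provenance: who asserted the claim


def claims_by_value(statements, subject, predicate):
    """Group the sources behind each competing value for (subject, predicate)."""
    grouped = defaultdict(set)
    for s in statements:
        if s.subject == subject and s.predicate == predicate:
            grouped[s.value].add(s.source)
    return dict(grouped)


# The Springer claim is from the slide; the second source is hypothetical.
statements = [
    Statement("K. Janowicz", "affiliation", "WSU", "Springer"),
    Statement("K. Janowicz", "affiliation", "UCSB", "author homepage (hypothetical)"),
]

affiliations = claims_by_value(statements, "K. Janowicz", "affiliation")
for value, sources in sorted(affiliations.items()):
    print(value, "<-", sorted(sources))
```

Keeping the source on every statement leaves the decision about which claim to trust to the consumer, who can weigh provenance for the purpose at hand.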

Page 10: Linked (Data) Scientometrics Keynote

Interoperability

Key Limitations: Meaningful Analysis and Synthesis

Ensuring that data is analyzed and combined in a meaningful way is far from trivial.
What if the information on how to use the data came together with the data?
Focus on smart data instead of (merely) smart applications.
The purpose of ontologies is not to agree on the meaning of terms but to make the data provider’s intended meaning explicit.

Page 11: Linked (Data) Scientometrics Keynote

Smart Data

The Smart Data Argument

One of the key arguments underlying the Semantic Web and Linked Data paradigms is to make data smart, not applications. Instead of developing increasingly complex software, the so-called business logic should be moved to the (meta)data. The rationale is that smart data will make all future applications more usable, flexible, and robust, while smarter applications fail to improve data along the same dimensions.

Why the Data Train Needs Semantic Rails. (2015) K. Janowicz, F. van Harmelen, J. Hendler, P. Hitzler (http://goo.gl/FMXOZT)

Page 12: Linked (Data) Scientometrics Keynote

Semantics-Enabled Linked-Data-Driven Scientometrics

How does this relate to scientometrics?

Page 13: Linked (Data) Scientometrics Keynote

Semantics-Enabled Linked-Data-Driven Scientometrics

Integrates data from a variety of sources, e.g., Semantic Web Dog Food, SWJ.
Example: http://stko-exp.geog.ucsb.edu/lak/

Page 14: Linked (Data) Scientometrics Keynote

Semantics-Enabled Linked-Data-Driven Scientometrics

ISWC Installation Based on New Deployment Framework

http://scientometrics.geog.ucsb.edu/iswc/

Smart Data: the first scientometrics installation (for SWJ) took months to develop and deploy; now we are down to hours, at least when leaving semantic lifting and data cleaning aside (!) and using a reduced number of modules (8/30).

Page 15: Linked (Data) Scientometrics Keynote

Semantics-Enabled Linked-Data-Driven Scientometrics

Value Proposition

Why do we use Semantic Web and Linked Data for Scientometrics?

Federated queries over multiple data sources
Unique global identifiers; easy conflation and deduplication
Transparent data model; reduces the need for guessing
No data silos, no API restrictions
Many pre-defined lightweight vocabularies (ontologies)
Smart data reduces the need for smart applications
Machine reasoning support
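
To illustrate why shared global identifiers make deduplication easier, here is a minimal sketch that merges records from two sources by keying on a common URI. It assumes both sources already use the same identifier for the same entity, which is the favorable case; the `merge_by_uri` helper, the `example.org` URIs, and the toy records are all hypothetical.

```python
def merge_by_uri(*sources):
    """Merge records from several sources, keyed on a shared global identifier."""
    merged = {}
    for records in sources:
        for record in records:
            entry = merged.setdefault(record["uri"], {})
            for key, value in record.items():
                if key != "uri":
                    # Keep every reported value rather than picking a winner.
                    entry.setdefault(key, set()).add(value)
    return merged


# Hypothetical records from two sources that share URIs.
source_a = [
    {"uri": "http://example.org/person/1", "name": "K. Janowicz"},
]
source_b = [
    {"uri": "http://example.org/person/1", "name": "Krzysztof Janowicz",
     "affiliation": "UCSB"},
    {"uri": "http://example.org/person/2", "name": "W. Zhang"},
]

people = merge_by_uri(source_a, source_b)
print(len(people), "distinct people")
```

Without a shared identifier, the two name variants for the first person would have to be matched by string similarity or other heuristics.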

So do we still need a deeper knowledge representation beyond surface semantics?

Page 16: Linked (Data) Scientometrics Keynote

Web@25

Web@25 Installation: Timeline

Keyword frequency for Semantic Web; WWW conference series (1994-2013)
http://stko-exp.geog.ucsb.edu/web25portal/index.html
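
A keyword timeline of this kind can be approximated by counting, per year, the papers that list a given keyword. The sketch below assumes each paper is a dictionary with a `year` and a `keywords` list; that record layout, the `keyword_timeline` helper, and the toy data are assumptions for illustration and say nothing about how the Web@25 portal is implemented.

```python
from collections import Counter


def keyword_timeline(papers, keyword):
    """Count papers per year whose keyword list contains `keyword` (case-insensitive)."""
    target = keyword.lower()
    counts = Counter()
    for paper in papers:
        if any(k.lower() == target for k in paper["keywords"]):
            counts[paper["year"]] += 1
    return dict(sorted(counts.items()))


# Toy records, invented for illustration only.
papers = [
    {"year": 2001, "keywords": ["Semantic Web", "Ontology"]},
    {"year": 2001, "keywords": ["semantic web"]},
    {"year": 2007, "keywords": ["Linked Data", "Semantic Web"]},
    {"year": 2007, "keywords": ["Search Engine"]},
]

timeline = keyword_timeline(papers, "Semantic Web")
print(timeline)
```

Raw counts like these depend on how many papers a venue accepted in each year, so a share of papers per year may be the fairer quantity to plot.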

Page 17: Linked (Data) Scientometrics Keynote

Web@25

Web@25 Installation: Timeline

Keyword frequency for Linked Data; WWW conference series (1994-2013)
http://stko-exp.geog.ucsb.edu/web25portal/

Page 18: Linked (Data) Scientometrics Keynote

An Interesting Question

Given the keyword timeline, is the Semantic Web as a research field disappearing, diversifying, radiating, ...?

Page 19: Linked (Data) Scientometrics Keynote

An Interesting Question

Let the data speak for themselves

Page 20: Linked (Data) Scientometrics Keynote

Web@25

Community detection

Colors: community membership, node size: frequency, line width: co-occurrence strength
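
The quantities named in the legend can be sketched as follows: node size is how many papers use a keyword, line width is how often two keywords appear on the same paper. The slide does not say which community detection algorithm was used, so the grouping step below is a deliberately simple stand-in (connected components over sufficiently strong edges), not the method behind the figure. All function names and the toy keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Return (keyword frequency, pairwise co-occurrence counts)."""
    freq = Counter()
    edges = Counter()
    for keywords in keyword_lists:
        unique = sorted(set(keywords))
        freq.update(unique)                     # node size
        edges.update(combinations(unique, 2))   # line width
    return freq, edges


def components(nodes, edges, min_weight=1):
    """Stand-in for community detection: groups linked by edges >= min_weight."""
    neighbors = {n: set() for n in nodes}
    for (a, b), weight in edges.items():
        if weight >= min_weight:
            neighbors[a].add(b)
            neighbors[b].add(a)
    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Toy keyword lists, one per paper, invented for illustration.
papers = [
    ["Semantic Web", "Ontology"],
    ["Semantic Web", "Ontology", "Linked Data"],
    ["Search Engine", "Ranking"],
    ["Search Engine", "Ranking"],
]

freq, edges = cooccurrence(papers)
groups = components(freq, edges, min_weight=2)
print(groups)
```

A modularity-based method would typically give a finer picture than a threshold on edge weight; the stand-in only shows where the inputs come from.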

Page 21: Linked (Data) Scientometrics Keynote

Web@25

Community detection

Colors: community membership, node size: frequency, line width: co-occurrence strength

Page 22: Linked (Data) Scientometrics Keynote

Web@25

Web@25 Installation: Self-organizing Map

Landscape analogy: counties, mountains, and valleys

Page 23: Linked (Data) Scientometrics Keynote

Web@25

Web@25 Installation: Self-organizing Map

Landscape analogy: counties, mountains, and valleys

Page 24: Linked (Data) Scientometrics Keynote

Web@25

Web@25 Installation: Mapping

Location of top institutions that published on Semantic Web between 2009 and 2013.

Page 25: Linked (Data) Scientometrics Keynote

Web@25

Web@25 Installation: Mapping

Similar pattern for the Linked Data keyword between 2009 and 2013.

Page 26: Linked (Data) Scientometrics Keynote

Web@25

Web@25 Installation: Mapping

Dissimilar pattern for the Search Engine keyword between 2009 and 2013.

Page 27: Linked (Data) Scientometrics Keynote

A Tale of Three Papers

Three Papers That Shaped the Semantic Web

Citations peaked in 2009 for the Ontology and Semantic Web papers.
More interestingly, why would you still cite these papers today?
http://stko-testing.geog.ucsb.edu/ios/

Page 28: Linked (Data) Scientometrics Keynote

A Tale of Three Papers

Three Papers That Shaped the Semantic Web

Top keywords: {Ontology, SW}, {Semantic Web, Ontology}, {Linked Data, Semantic Web, Ontology}

Page 29: Linked (Data) Scientometrics Keynote

A Tale of Three Papers

Three Papers That Shaped the Semantic Web

If a paper makes impact beyond its own home community, we should see an increase in keyword variability (entropy).
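
Keyword variability can be quantified with Shannon entropy over the keywords of the citing papers: 0 bits when every citing paper uses the same keyword, and higher values as the keywords spread more evenly over more topics. The sketch below assumes the keywords are pooled into one flat list per time window; the `keyword_entropy` helper and the two toy windows are hypothetical and not taken from the talk.

```python
import math
from collections import Counter


def keyword_entropy(keywords):
    """Shannon entropy (in bits) of the keyword distribution in `keywords`."""
    counts = Counter(keywords)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p), summed over all distinct keywords
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy windows, invented for illustration: keywords of papers citing one paper.
early_citations = ["Ontology", "Ontology", "Ontology", "Ontology"]
later_citations = ["Ontology", "Semantic Web", "Linked Data", "Bioinformatics"]

print(keyword_entropy(early_citations))
print(keyword_entropy(later_citations))
```

Comparing the value across successive time windows would show whether the citing papers drift away from the home community's vocabulary.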

Page 30: Linked (Data) Scientometrics Keynote

And Now?

So is the Semantic Web disappearing, diversifying, radiating, ...?

Page 31: Linked (Data) Scientometrics Keynote

And Now?

So is the Semantic Web disappearing, diversifying, radiating, ...?

No amount of data analytics is going to answer such questions before we precisely define and communicate what we mean by those terms.

Page 32: Linked (Data) Scientometrics Keynote

And Now?

Where Do We Go From Here?

Using Linked Data, ontologies, and basic reasoning capabilities allows us to rapidly deploy scientometrics installations.
Getting basic bibliographic data into (or as) Linked Data is becoming a trivial task.
Conflation, data enrichment, and the lack of rich metadata remain major problems.
    Discovering owl:sameAs links is just a subtask of conflation.
    Conflation race between academic publishers, libraries, ...
    Generate and enrich the data where it is created or first processed.
We need a rich but simple ontology that goes beyond academic publishing and includes the related processes and roles.
Revive Semantic Web Dog Food; ISWC really needs better metadata!
These slides and existing scientometrics systems are about embarrassingly simple analysis; everything else will need substantially stronger conceptual models and machinery (combining inductive & deductive methods).
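
Once owl:sameAs links have been discovered, grouping the linked identifiers into one cluster per real-world entity is the mechanical part. The following sketch does that with a small union-find structure. It does not address the harder problem of discovering the links in the first place, and the `same_as_clusters` helper and the `ex:` identifiers are hypothetical.

```python
def same_as_clusters(links):
    """Group identifiers connected by sameAs links (treated as symmetric, transitive)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return sorted(clusters.values(), key=sorted)


# Hypothetical sameAs links between identifiers from three sources.
links = [
    ("ex:a/janowicz", "ex:b/kjanowicz"),
    ("ex:b/kjanowicz", "ex:c/p42"),
    ("ex:a/zhang", "ex:c/p7"),
]

clusters = same_as_clusters(links)
for cluster in clusters:
    print(sorted(cluster))
```

Because the closure is transitive, a single wrong link merges two people into one cluster, which is one reason link quality matters more than link quantity.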
