Semantic Scholar
ICSTI
Towards a More Efficient Review of Research Literature
11 September 2018
www.semanticscholar.org
Allen Institute for Artificial Intelligence (https://allenai.org/)
Non-profit Research Institute in Seattle, founded by Microsoft co-founder Paul Allen
AI2 launched Jan. 2014
Team of 50, June 2016
Team of 75, June 2017
Team of ~100, June 2018
“AI for the common good”
Challenges for Researchers
The number of scientific papers has doubled every nine years since World War II.*
There are over 34,000 scholarly, peer-reviewed journals in existence today, collectively publishing some 2.5 million articles every year. It’s estimated that a single researcher, depending on their discipline, will read about 270** of them in the same time frame.
*Source: What We Cannot Know, by Prof. Marcus du Sautoy.
**Source: The STM Report, March 2015.
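As a rough illustration of the figures on this slide, the sketch below converts the nine-year doubling period into an equivalent yearly growth rate and computes the share of a year's articles that one researcher reads. It assumes steady compound growth, which the slide does not state, and the function names are hypothetical. Under that assumption the growth works out to roughly 8% per year, and 270 articles is about 0.01% of 2.5 million.

```python
# Figures quoted on the slide.
DOUBLING_PERIOD_YEARS = 9
ARTICLES_PUBLISHED_PER_YEAR = 2_500_000
ARTICLES_READ_PER_YEAR = 270


def annual_growth_rate(doubling_period_years: float) -> float:
    """Yearly growth rate implied by a doubling period.

    Assumption (not from the slide): growth compounds at a constant rate,
    so (1 + r) ** doubling_period == 2.
    """
    return 2 ** (1 / doubling_period_years) - 1


def fraction_read(read_per_year: int, published_per_year: int) -> float:
    """Share of one year's published articles that a single researcher reads."""
    return read_per_year / published_per_year


growth = annual_growth_rate(DOUBLING_PERIOD_YEARS)
share = fraction_read(ARTICLES_READ_PER_YEAR, ARTICLES_PUBLISHED_PER_YEAR)

print(f"Implied annual growth: {growth:.1%}")
print(f"Share of yearly output read: {share:.3%}")
```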
Key Challenges for Researchers
Semantic Scholar User Research, 2016
Core tasks have become more difficult:
• Staying current with research
• Placing research in context
• Evaluating importance of research
A.I.-driven approach to research
Make it easy to survey and consume the world's scholarly knowledge
Semantic Scholar
User Challenge: Staying current with research
Key Results:
• 40MM+ Papers Indexed in CS and Biomedicine
• Partners: IEEE, Springer Nature, PubMed, Science, MIT Press, arXiv, +more
• Global Reach
Semantic Scholar Approach:
• Search and discovery of Computer Science and Biomedicine
• Robust knowledge graph of papers, venues, topics, and authors
• Alerts on Authors, Papers; Reading library
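To make the "knowledge graph of papers, venues, topics, and authors" and the author alerts more concrete, here is a minimal in-memory sketch. It is not Semantic Scholar's implementation: the class names, fields, and the alert rule (notify every user who follows one of a new paper's authors) are assumptions for illustration only, and the sample paper is a placeholder.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical paper record; the fields are illustrative."""
    paper_id: str
    title: str
    venue: str
    authors: tuple[str, ...]
    topics: tuple[str, ...]


class LiteratureGraph:
    """Toy graph linking papers to venues, topics, and authors."""

    def __init__(self) -> None:
        self.papers: dict[str, Paper] = {}
        self.by_author: dict[str, set[str]] = defaultdict(set)
        self.by_venue: dict[str, set[str]] = defaultdict(set)
        self.by_topic: dict[str, set[str]] = defaultdict(set)
        self.followers: dict[str, set[str]] = defaultdict(set)  # author -> users

    def follow_author(self, user: str, author: str) -> None:
        """Register a user's alert on an author."""
        self.followers[author].add(user)

    def add_paper(self, paper: Paper) -> set[str]:
        """Index a paper and return the users who should be alerted."""
        self.papers[paper.paper_id] = paper
        self.by_venue[paper.venue].add(paper.paper_id)
        for topic in paper.topics:
            self.by_topic[topic].add(paper.paper_id)
        alerted: set[str] = set()
        for author in paper.authors:
            self.by_author[author].add(paper.paper_id)
            alerted |= self.followers.get(author, set())
        return alerted


graph = LiteratureGraph()
graph.follow_author("user-1", "A. Author")
alerted = graph.add_paper(
    Paper("p1", "Example Paper", "Example Venue",
          authors=("A. Author", "B. Author"), topics=("example topic",))
)
print(sorted(alerted))
```

Keeping one index per entity type makes lookups such as "all papers on a topic" a single dictionary access, which is the kind of query a topic page or an alert needs.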
Growth in indexed papers, partnerships
• Springer Nature: Publisher of some of the world's most influential science and technology journals.
• PubMed: More than 27 million citations for biomedical papers.
• Frontiers: Peer-reviewed open access journals in science and technology.
• ACM: Publications and transactions from the Association for Computing Machinery.
• IEEE: Leading organization for technology publishing.
• ACL: Publications and transactions from the Association for Computational Linguistics.
• arXiv: Open access to e-prints in Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance and Statistics.
• DBLP: The online reference for open bibliographic information on computer science journals and proceedings.
• CiteSeer: Scientific literature digital library and search engine.
• OdySci Academic: A website for searching and ranking technical publications.
• AMiner: Mining deep knowledge from scientific sources.
• Hyper Articles en Ligne (HAL): Open archive run by the Centre pour la communication scientifique directe, a French computing centre.
• Science: Peer-reviewed academic journal of the American Association for the Advancement of Science.
• MIT Press: A university press affiliated with the Massachusetts Institute of Technology.
40MM+ papers indexed
Began with Computer Science; launched biomedicine in Fall 2017
Alerts and Library
Growth in user traffic
• 1.7MM Monthly Active Users
• 300%+ year-on-year growth
• Globally Distributed
Semantic Scholar
User Challenge: Placing research in context
Key Results:
• Millions of matched Conference Presentations, Videos, Code Libraries, News Articles, and Scientific Blogs
• Relevant search results
Semantic Scholar Approach:
• Augment with relevant additional content: Featured Presentations, Videos, Code Libraries, News, Blogs
• Automatic generation of topic summary pages with key papers
• Search relevance incorporates citations, citation influence
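The slide says that search relevance incorporates citations and citation influence, but gives no formula. The sketch below is one hedged way such signals could be combined: a text-match score is boosted by log-damped citation counts, with influential citations weighted more heavily. The weights, the field names, and the formula itself are assumptions made up for illustration, not Semantic Scholar's ranking function.

```python
import math
from dataclasses import dataclass


@dataclass
class SearchHit:
    """Hypothetical search result; all fields are illustrative."""
    title: str
    text_match: float            # query/text similarity, assumed to lie in [0, 1]
    citations: int
    influential_citations: int


# Hypothetical weights: influential citations count for more than plain ones.
CITATION_WEIGHT = 0.3
INFLUENCE_WEIGHT = 0.7


def relevance(hit: SearchHit) -> float:
    """Boost the text match by log-damped citation signals."""
    citation_signal = CITATION_WEIGHT * math.log1p(hit.citations)
    influence_signal = INFLUENCE_WEIGHT * math.log1p(hit.influential_citations)
    return hit.text_match * (1.0 + citation_signal + influence_signal)


def rank(hits: list[SearchHit]) -> list[SearchHit]:
    """Order hits from most to least relevant."""
    return sorted(hits, key=relevance, reverse=True)


hits = [
    SearchHit("Paper A", 0.80, citations=10, influential_citations=0),
    SearchHit("Paper B", 0.80, citations=10, influential_citations=5),
    SearchHit("Paper C", 0.10, citations=5000, influential_citations=400),
]
ranked_titles = [hit.title for hit in rank(hits)]
print(ranked_titles)
```

The logarithm keeps a heavily cited but poorly matching paper (Paper C) from outranking closer matches, while influential citations separate two otherwise identical hits (Paper B over Paper A).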
Placing Research in Context – Relevant external content
Currently available are millions of highly relevant:
• Presentations
• Videos
• Academic Blogs
• Code Repositories
Evaluating:
• News Articles
• Clinical trials
• Twitter mentions
• Mendeley references
Placing Research in Context – Search Relevance & Filtering
Automatically generated Topic Pages
Semantic Scholar
User Challenge: Evaluating importance of research
Key Results:
• Parse 470k entities, 2.5MM relations, 335MM entity mentions
• Ongoing improvements to boost precision and recall
Semantic Scholar Approach:
• Extract citations and build citation graph
• Extract abstract, charts, tables, and other metadata
• Models trained to extract numerical results, topics, and other semantics
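The "extract citations and build citation graph" step can be pictured with a small sketch. It assumes the extraction stage has already produced, for each paper, a list of the paper IDs it cites; that input format, the IDs, and the function names are hypothetical. The code builds the forward and reverse edges and counts incoming citations, deduplicating repeated references.

```python
from collections import defaultdict


def build_citation_graph(references: dict[str, list[str]]):
    """Build forward (cites) and reverse (cited_by) edges.

    `references` maps a citing paper ID to the paper IDs found in its
    reference list. This input format is an assumption for illustration.
    """
    cites = {paper: set(refs) for paper, refs in references.items()}
    cited_by: dict[str, set[str]] = defaultdict(set)
    for paper, refs in cites.items():
        for ref in refs:
            cited_by[ref].add(paper)
    return cites, dict(cited_by)


def citation_counts(cited_by: dict[str, set[str]]) -> dict[str, int]:
    """Number of distinct papers citing each paper."""
    return {paper: len(citers) for paper, citers in cited_by.items()}


# Placeholder extraction output; P4 lists P3 twice, which counts once.
extracted = {
    "P1": ["P2", "P3"],
    "P2": ["P3"],
    "P4": ["P3", "P3"],
}
cites, cited_by = build_citation_graph(extracted)
counts = citation_counts(cited_by)
print(counts)
```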
Evaluating Research: Influential citations
Evaluating Research: Tables, Charts, Metadata
Science Parse – Extracting metadata at high precision
Title: Combining Retrieval, Statistics, and Inference to Answer Elementary Science Questions
Authors: Peter Clark, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Turney, Daniel Khashabi
Authors: Clark, P.; Balasubramanian, N.; Bhakthavatsalam, S.; Humphreys, K.; Kinkead, J.; Sabharwal, A.; and Tafjord, O.
Title: Automatic construction of inference-supporting knowledge bases
Year: 2014
Groundbreaking AI Research
Extract meaningful structures
Examples:
● Entity extraction.
● Entity linking.
● Relation extraction.
● Answering FAQs.
● Entity discovery.
● Semantic frames.
● Figure extraction.
Macro analysis
Examples:
● Association between prepublishing & citations.
● Identifying demographic bias in clinical trials.
● Change of affiliations vs. research productivity.
● Peer reviews.
Synthesize knowledge
Examples:
● Ontology matching.
● Literature graph.
● Table aggregation.
● Summarization.
● Citation classification.
● Topic page compilation.
● User models.
Research Impact
Extract meaningful structures
Ammar et al. SemEval’17
Peters et al. ACL’17
Siegel et al. JCDL’18
Ammar et al. NAACL’18
Beltagy et al. (in submission)
Macro analysis
Kang et al. NAACL’18
Feldman et al. arXiv’18
Synthesize knowledge
Wang et al. BioNLP’18
Bhagavatula et al. NAACL’18