Discovering Scholarly Orphans Using ORCID
@mart1nkle1n, @hvdsomp
JCDL 2017, 06/22/2017, Toronto, CA
Discovering Scholarly Orphans
Using ORCID
Martin Klein @mart1nkle1n
http://orcid.org/0000-0003-0130-2097
Herbert Van de Sompel @hvdsomp
http://orcid.org/0000-0002-0715-6126
Research Library
Los Alamos National Laboratory
Novel Archival Paradigm
• Current paradigm:
  • Owner of the scholarly record submits a finalized, atomic record to a custodian, which takes care of long-term preservation
  • E.g., a publisher uploads journals to Portico; an author uploads a paper into an institutional repository
• Fails, even for traditional journal articles:
  • A significant number of journal articles do not make it into archives
  • Institutional repositories (IRs) are under-utilized
• Does not account for web-based scholarship: living things with versions, web resources related to a paper
Argument for a novel paradigm to capture web-based scholarly resources
Capture Flow
Algorithmic Discovery of Web Identities
James Powell et al. (2014) EgoSystem: Where are our alumni?
In: code4lib http://journal.code4lib.org/articles/9519
Capture Flow
Discovery of Web Identities via a Registry: ORCID
Ian Milligan http://orcid.org/0000-0002-1470-7723
Mark Matienzo http://orcid.org/0000-0003-3270-1306
Mark Matienzo’s ORCID
• Web Identities: 3 (homepage, ScopusID, ResearcherID)
http://orcid.org/0000-0003-3270-1306
Mark Matienzo’s Home Page
• URIs to GitHub repository and Twitter
• These could be included in the ORCID profile
http://matienzo.org/
Ian Milligan’s ORCID
• Web Identities: 0
http://orcid.org/0000-0002-1470-7723
Discovery of Web Identities via a Registry: ORCID
• Evaluation of ORCID for automatic discovery of Web Identities
• How well does ORCID represent the global community of active researchers?
  • Adoption rate
  • Subject coverage
  • Geo-location coverage
• How well does ORCID score when it comes to listing Web Identities?
Discovery of Web Identities via a Registry: ORCID
ORCID data
ORCID - Adoption Rate
• Extract from ORCID records:
  • First name
  • Last name
  • Affiliations
  • Works (publications, datasets, etc.)
  • Web identities
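The extraction step can be sketched as a per-field count of populated values. This is a minimal illustration only: the dictionary layout, the field names, and the sample records are hypothetical stand-ins for parsed ORCID records, not the actual ORCID schema or the authors' code.

```python
from collections import Counter

# Fields named on the slide. The keys are hypothetical names for a
# simplified, already-parsed record; they are not the ORCID schema.
FIELDS = ("first_name", "last_name", "affiliations", "works", "web_identities")


def count_populated_fields(records):
    """Count, per field, how many records carry a non-empty value."""
    counts = Counter(total=0)
    for record in records:
        counts["total"] += 1
        for field in FIELDS:
            # Empty strings, empty lists and missing keys all count as absent.
            if record.get(field):
                counts[field] += 1
    return counts


# Made-up sample records, for illustration only.
sample_records = [
    {"first_name": "A", "last_name": "B",
     "affiliations": ["Example University"],
     "works": [{"type": "journal-article"}],
     "web_identities": []},
    {"first_name": "C", "last_name": "",
     "affiliations": [], "works": [],
     "web_identities": [{"label": "homepage", "url": "http://example.org/"}]},
]

field_counts = count_populated_fields(sample_records)
```

Running the same count over one snapshot of records per year would give the kind of per-year series an adoption-rate chart needs.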
ORCID - Adoption Rate
Figure: number of ORCIDs per year, 2013 to 2016 (y-axis from 0 to 2,500,000).
Series: ORCIDs total, ORCIDs with given names, ORCIDs with first names, ORCIDs with works, ORCIDs with affiliations, ORCIDs with web identities.
ORCID - Subject Coverage
• Extract DOIs from works
• Match DOIs against CrossRef’s Metadata API
• Obtain subject terms
• Match against descriptive terms from the “Classification of Instructional Programs” (CIP) published by the Institute of Education Sciences
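The matching steps can be sketched as follows, working on already-decoded JSON responses so that no network access is needed. The response shape (a "subject" list under "message"), the term-to-category lookup, and the sample responses are assumptions made for illustration; a real mapping would be derived from the CIP descriptive terms, and this is not the authors' implementation.

```python
from collections import Counter

# Hypothetical lookup from lower-cased subject terms to CIP-style
# category names; invented for this sketch.
SUBJECT_TO_CIP = {
    "molecular biology": "Biological and Biomedical Sciences",
    "software": "Computer and Information Sciences",
    "history": "History",
}


def subjects_from_response(response):
    """Return the subject terms of one work from a decoded JSON response.

    Assumes the shape {"message": {"subject": [...]}}; a work without
    subject terms yields an empty list.
    """
    return response.get("message", {}).get("subject", [])


def cip_distribution(responses, mapping=SUBJECT_TO_CIP):
    """Count CIP categories over all works; unmatched terms are skipped."""
    counts = Counter()
    for response in responses:
        for term in subjects_from_response(response):
            category = mapping.get(term.strip().lower())
            if category is not None:
                counts[category] += 1
    return counts


# Made-up responses, one per DOI, for illustration only.
sample_responses = [
    {"message": {"DOI": "10.1000/example.1", "subject": ["Molecular Biology"]}},
    {"message": {"DOI": "10.1000/example.2", "subject": ["History", "Unmapped Term"]}},
    {"message": {"DOI": "10.1000/example.3"}},
]

subject_counts = cip_distribution(sample_responses)
```

Terms that do not appear in the lookup are dropped silently here; a fuller version would probably log them so that the mapping can be extended.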
ORCID - Subject Coverage
2013
ORCID - Subject Coverage
Changes from 2013 to 2014
Ranks gained:
• Social Science
• Education
• History
Ranks lost:
• Computer Science
• Legal professions
• Journalism
ORCID - Subject Coverage
Changes from 2014 to 2015
Ranks gained:
• Social Science
• Education
Ranks lost:
• Natural Resources and Conservation
ORCID - Subject Coverage
Changes from 2015 to 2016
Ranks gained:
• Natural Resources and Conservation
Ranks lost:
• Multi/Interdisciplinary Studies
ORCID - Subject Coverage
Comparison of ORCID subjects with:
1. Distribution of researchers’ disciplines
  • Proxy: Ph.D. recipients from U.S. universities
  • Obtained from NSF, 2015 data
2. Distribution of publications’ disciplines
  • Obtained from UNESCO Science Report
  • U.S. data from 2014
Both report disciplines aligned with CIP terms, hence they are easily comparable.
ORCID - Subject Coverage
Figure: values per discipline group on an axis from 0 to 60.
Discipline groups: Other, Life Sciences, Physical Sciences, Mathematics and Computer Sciences, Education, Psychology and Social Sciences, Engineering, Humanities and Arts.
Series: ORCID Subjects, Ph.D. Researchers, Publications.
ORCID - Geo-Location Coverage
• Extract affiliations from ORCID records
• Aggregate country codes for associated locations
• Only available in ORCID data since 2015
• Compare against UNESCO data on researcher distribution
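A minimal sketch of the aggregation and comparison, assuming each record carries a list of affiliations with a two-letter "country" key. The record layout, the sample records, and the reference shares are all made up for illustration; they are not ORCID or UNESCO figures, and the counting rule (one count per distinct country per record) is just one plausible choice.

```python
from collections import Counter


def country_shares(records):
    """Return each country's share, in percent, of all (record, country) pairs.

    A record contributes once per distinct country code found across
    its affiliations.
    """
    counts = Counter()
    for record in records:
        codes = {a["country"] for a in record.get("affiliations", [])
                 if a.get("country")}
        counts.update(codes)
    total = sum(counts.values())
    if not total:
        return {}
    return {code: 100.0 * n / total for code, n in counts.items()}


def share_differences(observed, reference):
    """Observed minus reference share per country, in percentage points."""
    codes = set(observed) | set(reference)
    return {c: observed.get(c, 0.0) - reference.get(c, 0.0) for c in codes}


# Made-up records and reference shares, for illustration only.
sample_records = [
    {"affiliations": [{"country": "US"}, {"country": "US"}]},
    {"affiliations": [{"country": "CA"}]},
    {"affiliations": [{"country": "US"}, {"country": "DE"}]},
    {"affiliations": []},
]
reference_shares = {"US": 40.0, "CA": 30.0, "DE": 30.0}

observed_shares = country_shares(sample_records)
differences = share_differences(observed_shares, reference_shares)
```

Positive differences mark countries that are over-represented relative to the reference distribution, negative ones mark under-represented countries.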
ORCID - Geo-Location Coverage
ORCID - Web Identities
• Analyze distribution of link “Labels”
• Field lacks controlled vocabulary
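Because the label field is free text, any count of labels depends on how spelling variants are folded together. The sketch below applies a simple normalization (case folding, punctuation and whitespace collapsed) before counting; the normalization rule and the sample labels are assumptions for illustration, not the procedure used in the talk.

```python
import re
from collections import Counter


def normalize_label(label):
    """Case-fold a label and collapse punctuation and whitespace to one space."""
    return re.sub(r"[\W_]+", " ", label.casefold()).strip()


def top_labels(labels, n=20):
    """Return the n most frequent normalized labels with their counts."""
    counts = Counter(normalize_label(label) for label in labels)
    counts.pop("", None)  # drop labels that were empty after normalization
    return counts.most_common(n)


# Made-up labels, for illustration only.
sample_labels = ["Homepage", "homepage ", "Home-Page", "LinkedIn",
                 "linkedin", "Twitter", ""]

label_counts = dict(top_labels(sample_labels))
```

Note that "Home-Page" normalizes to "home page" and is still counted apart from "homepage", which shows how far simple string cleanup falls short of a controlled vocabulary.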
ORCID - Web Identities
Top 20 labels 2016
Capture Flow
Ian Milligan’s ORCID
• Artifacts?
http://orcid.org/0000-0002-1470-7723
ORCID - Scholarly Orphans
• Analyze the distribution of “Work” types, e.g.:
  • “journal article”: likely not an orphan
  • “data-set”: potential orphan
https://members.orcid.org/api/resources/work-types
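The work-type analysis can be sketched as a frequency count followed by a split into types that are likely archived through traditional channels and types that may be orphans. The type spellings, the set of likely-archived types, and the sample list are assumptions for illustration; only the two examples on the slide (journal article, data-set) come from the talk.

```python
from collections import Counter

# Assumption: types treated as likely archived through traditional
# channels. Only "journal-article" is motivated by the slide; a fuller
# set would need a decision for every ORCID work type.
LIKELY_ARCHIVED = {"journal-article"}


def orphan_candidates(work_types, likely_archived=LIKELY_ARCHIVED):
    """Split work-type counts into all types and potential-orphan types.

    Returns (all_counts, candidate_counts, candidate_share_in_percent).
    """
    counts = Counter(work_types)
    total = sum(counts.values())
    candidates = {t: n for t, n in counts.items() if t not in likely_archived}
    share = 100.0 * sum(candidates.values()) / total if total else 0.0
    return counts, candidates, share


# Made-up list of work types, for illustration only.
sample_types = ["journal-article", "journal-article", "journal-article", "data-set"]

all_counts, candidate_counts, candidate_share = orphan_candidates(sample_types)
```

A low candidate share on real data would match the observation that the listed works are dominated by types that are not expected to be orphans.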
ORCID - Work Types
Dominated by types expected not to be orphans!
Take-Aways
• ORCID adoption rate is increasing
• Subject coverage is focused; it does not cover all disciplines equally
• Geo-location coverage is good but not quite representative
• Web Identity coverage is poor; not usable for our purpose in its current state
• Very few scholarly orphans are directly referenced