The Metadata [R]evolution: Transformative Opportunities September 18, 2013
Presented by
Using VIVO, Scopus, and PubMed to disambiguate Weill Cornell authors
Paul Albert
[email protected]
Weill Cornell Medical College
Original approach for managing faculty publications: rely on researchers or their proxies to manually enter publications.
Does this work?
Researchers’ response to email requesting copy of CV
Why don’t our researchers care?
Failing to rigorously maintain an accurate list of publications is a rational choice. Time spent on maintaining publications bears a perceived, but more often real, opportunity cost.
Revised approach for managing faculty publications: use data from Scopus and PubMed to maintain profiles for them
Our publication ingest workflow
1. Librarian formulates queries and stores them in a Google Doc.
2. Developer queries the Scopus API and translates the result into XML, then uses the DOI and PMID to look up the record in PubMed.
3. Combine metadata from both sources as a candidate for ingest.
4. If duplicate, disregard. If new, ingest.
5. Re-ingest temporal data such as citation count.
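The duplicate check and the citation-count re-ingest in the workflow above can be sketched as follows. This is a minimal illustration, not the Weill Cornell implementation: the in-memory `store` dict, the functions `record_key` and `ingest`, and the record keys (`doi`, `pmid`, `citation_count`) are hypothetical names chosen for the example.

```python
from typing import Optional


def record_key(record: dict) -> Optional[str]:
    """Return a stable identifier for a candidate, preferring DOI over PMID."""
    if record.get("doi"):
        return "doi:" + record["doi"].lower()  # DOIs are case-insensitive
    if record.get("pmid"):
        return "pmid:" + str(record["pmid"])
    return None  # no usable identifier


def ingest(store: dict, candidate: dict) -> str:
    """Ingest a candidate record and report what happened to it."""
    key = record_key(candidate)
    if key is None:
        return "skipped"
    if key not in store:
        store[key] = dict(candidate)  # new publication: ingest a copy
        return "ingested"
    # Duplicate: keep the stored record, but refresh temporal data.
    new_count = candidate.get("citation_count")
    if new_count is not None and new_count != store[key].get("citation_count"):
        store[key]["citation_count"] = new_count
        return "refreshed"
    return "duplicate"
```

One limitation of this sketch: a record seen once with only a DOI and later with only a PMID would not be recognized as a duplicate, so a real pipeline would need to reconcile both identifiers.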
What is ingested from where?
Scopus
• Full Author Names
• Article Title
• Journal Title
• DOI
• PMID (PubMed Identifier)
• Date of publication
• ISSN
• Citation count
PubMed
• Abstract
• Medical Subject Headings (MeSH)
• Funding
• PubMed Central Identifier
• Status (e.g., in process)
• Second ISSN
• Language
• Journal abbreviation
• Publication type
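The field split above amounts to a merge in which each source is trusted for its own fields. The sketch below shows one way to express that in Python. The key names (`authors`, `pmcid`, `issn_secondary`, and so on) and the function `merge_candidate` are assumptions made for illustration; the slides do not give the actual schema.

```python
from typing import Optional

# Hypothetical key names for the fields taken from each source.
SCOPUS_FIELDS = ("authors", "title", "journal", "doi", "pmid",
                 "pub_date", "issn", "citation_count")
PUBMED_FIELDS = ("abstract", "mesh", "funding", "pmcid", "status",
                 "issn_secondary", "language", "journal_abbrev", "pub_type")


def merge_candidate(scopus: dict, pubmed: Optional[dict]) -> dict:
    """Build one ingest candidate, taking each field only from its source."""
    pubmed = pubmed or {}  # a Scopus record may have no PubMed match
    merged = {field: scopus.get(field) for field in SCOPUS_FIELDS}
    merged.update({field: pubmed.get(field) for field in PUBMED_FIELDS})
    return merged
```

Fields missing from a source come through as `None`, so a downstream step can tell "not supplied" apart from an empty value.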
A key consideration: will a publication ingest be institution-centric or person-centric?
Query by institution
• Easier to identify hits
• Easier for institutional reporting, especially year-to-year comparisons
• Assertions of co-author identity can be unclear
Query by person
• More laborious – need an internal source for people
• Often accounts for publications w/ no or incorrect affiliation
• Accounts for previous affiliations
Affiliation ID = “Weill”
Author ID = “8256757” x 1300
Scopus commits two varieties of disambiguation errors
Splitting - one person, multiple author IDs; relatively easy to recover from
Lumping - multiple people, one author ID; relatively hard to recover from
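The two error types can be detected mechanically once a curated mapping from people to Scopus author IDs exists. The following is a minimal sketch under that assumption; the function `classify_authors` and the input shape (person name mapped to a set of author IDs) are hypothetical. It can only detect lumping between people who both appear in the input, so an ID shared with someone outside the institution would go unnoticed.

```python
from collections import defaultdict


def classify_authors(person_to_ids: dict) -> dict:
    """Label each person as 'ideal', 'splitting', 'lumping', or 'both'."""
    # Invert the mapping so we can see which people share an author ID.
    id_to_people = defaultdict(set)
    for person, ids in person_to_ids.items():
        for author_id in ids:
            id_to_people[author_id].add(person)

    labels = {}
    for person, ids in person_to_ids.items():
        split = len(ids) > 1  # one person, multiple author IDs
        lumped = any(len(id_to_people[a]) > 1 for a in ids)  # shared ID
        if split and lumped:
            labels[person] = "both"
        elif split:
            labels[person] = "splitting"
        elif lumped:
            labels[person] = "lumping"
        else:
            labels[person] = "ideal"
    return labels
```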
How accurate is Scopus at author disambiguation c. 2013? Gold standard = librarian judgment
• Ideal – one-to-one relation between Scopus author ID and person: n=369
• Splitting – more than one author ID per person: n=707
• Lumping – more than one person per author ID: n=86
• Both errors: n=23
Two author disambiguation methods against a gold standard
Methods compared: name query, Scopus
From Johnson et al. Submitted. “Automatic generation of investigator bibliographies for institutional research networking systems.”
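The slide does not say which measures were used for the comparison. Precision and recall are a common choice for scoring a retrieved publication list against a gold standard, so the sketch below assumes them; the function name and the use of PMID-like strings as identifiers are illustrative only.

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Score one author's retrieved publications against the gold standard."""
    true_positives = len(retrieved & gold)
    # Precision: share of retrieved publications that are really the author's.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of the author's real publications that were retrieved.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Under this framing, lumping errors mainly hurt precision (other people's papers are pulled in) and splitting errors mainly hurt recall (papers under a second ID are missed).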
“Special queries” can compensate for lumping errors
Examples of special queries
(AU-ID(7405920800)) AND (AF-ID(60007997) OR AF-ID(60009470) OR AF-ID(60019868))
(AU-ID(7402763146)) AND (AF-ID(60007997) OR AF-ID(60019868) OR AF-ID(60018043) OR AF-ID(60007997) OR AF-ID(60019868) OR AF-ID(100366692) OR AF-ID(60018043) OR AF-ID(60002339) OR AF-ID(60009343) OR AF-ID(60024541) OR AF-ID(60025843) OR AF-ID(60027565))
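Queries of this shape restrict a lumped author ID to the affiliations known for the intended person. They are easy to generate from a list of IDs. The helper below is a hypothetical sketch (the function name is not from the slides); it reproduces the `AU-ID(...) AND (AF-ID(...) OR ...)` pattern of the examples and collapses repeated affiliation IDs such as those in the second example.

```python
from typing import Iterable


def build_special_query(author_id: str, affiliation_ids: Iterable[str]) -> str:
    """Build a Scopus query limiting one author ID to known affiliations."""
    # Strip stray whitespace and drop repeats while keeping the original order.
    unique_ids = list(dict.fromkeys(a.strip() for a in affiliation_ids))
    if not unique_ids:
        raise ValueError("at least one affiliation ID is required")
    affiliations = " OR ".join(f"AF-ID({a})" for a in unique_ids)
    return f"(AU-ID({author_id})) AND ({affiliations})"


print(build_special_query("7405920800", ["60007997", "60009470", "60019868"]))
```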
How can VIVO data address pressing institutional needs in order to strengthen its viability?
NIH Open Access policy compliance
WCMC authors who have received NIH funding but haven’t deposited pre-prints in PubMed Central receive a “nastygram” (a personalized notice).
[Chart: values from 0.78 to 0.9 on the vertical axis, by month from March to September]
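The compliance check described on this slide can be sketched from two of the PubMed fields ingested earlier, funding and PubMed Central identifier. The code below is an illustration under stated assumptions: the record keys (`nih_funded`, `pmcid`, `pmid`, `wcmc_authors`) and both function names are hypothetical, and the rate is assumed to be the share of NIH-funded publications that have a PubMed Central identifier.

```python
from collections import defaultdict
from typing import Optional


def find_noncompliant(publications: list) -> dict:
    """Map each author to PMIDs that are NIH-funded but lack a PMC identifier."""
    notices = defaultdict(list)
    for pub in publications:
        if pub.get("nih_funded") and not pub.get("pmcid"):
            for author in pub.get("wcmc_authors", []):
                notices[author].append(pub["pmid"])
    return dict(notices)


def compliance_rate(publications: list) -> Optional[float]:
    """Share of NIH-funded publications with a PMC identifier, or None."""
    funded = [p for p in publications if p.get("nih_funded")]
    if not funded:
        return None  # rate is undefined without NIH-funded publications
    return sum(1 for p in funded if p.get("pmcid")) / len(funded)
```

The per-author dictionary is the natural input for generating one notice per person rather than one per publication.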
Co-author network and expertise of arbitrary group of faculty
Suggested publications in annual faculty review tool
Administrators are avid consumers of institutional data.
Administrators want reporting tools (especially about publications) that:
• Have current data
• Are easy to use
• Allow for sophisticated queries
VIVO Dashboard now under development
Expertise recommendation tool also under development
Acknowledgements
Eliza Chan and Prakash Adekkanattu - developers at Weill Cornell
Don Carpenter and Zeheng Wang - VIVO Dashboard developers
Jie Lin - Expertise Recommendation Tool developer
Drew Wright - publications help and NIH Access Policy compliance