A New Approach for Automated Author Discipline Categorization and Evaluation of Cross-Disciplinary Collaborations for Grant Programs

Ilya Ponomarev¹, Pawel Sulima¹, Jodi Basner¹, Unni Jensen¹, Joshua Schnell¹, Karen Jo², and Nicole Moore²

¹ Custom Analytics, Rockville, MD
² National Cancer Institute, Bethesda, MD

ilya.ponomarev@thomsonreuters.com

10/16/2013 5:30 PM



Why Cross-disciplinary Research?

“Interdisciplinary research can be one of the most productive and inspiring of human pursuits”

Facilitating Interdisciplinary Research, National Academy of Sciences, 2005

• Innovation increasingly occurs at the boundaries of disciplines
• Complex “puzzles” require diverse backgrounds
• Data avalanche from multiple sources requires fusion of information
• Convergent technologies require integration across disciplines

US Government Funding of Cross-disciplinary R&D

DOD, DOE, NSF, NIH, NASA

How to Measure the Success of a Cross-disciplinary Program?

THIS TALK:

1. To measure cross-disciplinarity, define disciplines as accurately as possible

2. A general approach for automatically assigning grant-specific categories to papers and people

3. Application to NCI PS-OC grant program classification

See also J. Basner, “Evaluating Collaboration and Outcomes of Health Research”, Friday, 10/18/2013, 11:00am at Gunston East Rm

NCI Physical Sciences-Oncology Centers

12 centers, 250 researchers

09/2009-Current

Institute, Facilitate, Generate

Evaluation: Bird's-Eye View

1. Use publications as a proxy for outcomes

2006-2008: 3,367 pubs

2009-2012: 601 reported pubs

2. Compare the baseline data set (2006-2008) with the ongoing research data set (2009-2012)

Web of Science + Medline

• 166 active PS-OC investigators
• 202,000 references
• 4,199 journal titles

Productivity, impact, collaboration, fields convergence

J. Basner, Friday, 10/18/2013

Evaluation: Bird's-Eye View

Approach:

PS-OC 2/3 broad categories: Oncology, Physical Sciences, Life Sciences

PS-OC 3 broad categories: Oncology, Physical Sciences, Life Sciences

266 Web of Science Journal Subject Categories

• Has an Oncology SC
• Multiple SCs per journal (up to 7)
• Multidisciplinary SC (meaningless, but includes “Science”, “Nature”)
• Some SCs are already interdisciplinary
• LS dominates after aggregation

22 ESI Subject Categories

• One SC per journal
• Does not have Oncology
• Multidisciplinary SC also exists
• Clinical medicine?
• LS dominates after aggregation

Mapping. Challenges

PS-OC 3 broad categories: Oncology, Physical Sciences, Life Sciences

Web of Science 266 Journal SCs:

• Has an Oncology SC
• Multiple SCs per journal
• Multidisciplinary
• Some SCs are interdisciplinary
• LS dominates after aggregation

Web of Science 22 Broad ESI categories:

• One SC per journal
• Does not have Oncology
• Multidisciplinary SC also exists
• Clinical medicine?
• LS dominates after aggregation

Approach:

1. Intermediate map onto 6 extended broad categories

2. Paper-level SC assignment based on references

Step 1. Introduce 6 Intermediate PS-OC Categories for Better Selection:

• PS – Physical Sciences
• LS – Life Sciences
• OC – Oncology
• MED – Medicine
• OTH – Others
• MULT – Multidisciplinary

(very often MED journals are closer to OC than to LS)

Will be dropped at the final stage

Step 2. Map 265 WoS JSC to 6 PS-OC Categories:

Examples:

a) Obvious: Acoustics → PS; Chemistry, Analytical → PS; Oncology → OC; Management → OTH

b) Dominant: Biophysics → PS

c) Dominant: Physics, Multidisciplinary → PS

d) Meaningless: Multidisciplinary → MULT (usually published in “Nature”, “Science” or “PNAS”)

MULT is meaningless for assigning a PS-OC category: an article published in a MULT journal can be about PS, LS, or OC, and it is usually not an interdisciplinary article. The article’s research field must be re-classified based on its references.

Step 3. Assign PS-OC Category Weights to Each Journal

(Journals in WoS can have 1, 2, 3, … even 7 SCs)

Example: journal “Radiation Research” – 3 SCs:

1. Map: Biology → LS; Biophysics → PS; Radiology, NM → PS
2. Select distinct PS-OC categories: LS, PS
3. Count total (denominator): 2
4. Weights: LS = 1/2, PS = 1/2, OC = 0, MED = 0, MULT = 0, OTH = 0

Each journal should be counted equally
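The journal-weight step can be sketched in a few lines of Python. This is only an illustration of the slide's "Radiation Research" example, not the authors' implementation: the names `SC_TO_PSOC` and `journal_weights` are hypothetical, and the map holds just the subject categories quoted on the slides.

```python
CATEGORIES = ("LS", "PS", "OC", "MED", "MULT", "OTH")

# Illustrative subset of the subject-category map (assumption: only the
# pairs quoted on the slides; the real map covers all WoS categories).
SC_TO_PSOC = {
    "Biology": "LS",
    "Biophysics": "PS",
    "Radiology, NM": "PS",
    "Oncology": "OC",
    "Management": "OTH",
}


def journal_weights(subject_categories):
    """Split a total weight of 1 equally over the distinct PS-OC categories."""
    distinct = {SC_TO_PSOC[sc] for sc in subject_categories}
    if not distinct:
        raise ValueError("journal has no subject categories")
    share = 1.0 / len(distinct)
    return {cat: (share if cat in distinct else 0.0) for cat in CATEGORIES}


# "Radiation Research": three SCs collapse to two distinct categories.
radiation_research = journal_weights(["Biology", "Biophysics", "Radiology, NM"])
print(radiation_research)
```

Because the weights of every journal sum to 1, a journal with seven SCs contributes no more than a journal with one, which is what "counted equally" asks for.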


Step 4. Calculate combined J-R weights for publications:

Example: Coffey D., Getzenberg R., JAMA, 2006: 1 journal category (MED = 1), 26 references

Category   Journal weights   Avg. reference weights   ½ (Journal + Refs)
LS         0                 0.23                     0.12
PS         0                 0.04                     0.019
MED        1                 0.17                     0.58
OC         0                 0.36                     0.18
MULT       0                 0.19                     0.1
OTH        0                 0                        0

Better assignment of a paper’s field, based on what the paper cites
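A minimal sketch of the ½ (Journal + Refs) combination, using the rounded numbers from the JAMA example above. The function names `average_weights` and `combine_journal_and_refs` are hypothetical, and the sketch assumes each reference has already been given category weights the same way as a journal.

```python
CATEGORIES = ("LS", "PS", "OC", "MED", "MULT", "OTH")


def average_weights(weight_dicts):
    """Average category weights over a list of references (assumed non-empty)."""
    n = len(weight_dicts)
    return {c: sum(w.get(c, 0.0) for w in weight_dicts) / n for c in CATEGORIES}


def combine_journal_and_refs(journal, refs_average):
    """Combined J-R weight: half journal weight, half averaged reference weight."""
    return {
        c: 0.5 * (journal.get(c, 0.0) + refs_average.get(c, 0.0))
        for c in CATEGORIES
    }


# Slide example: the journal is pure MED; the averaged reference weights
# are the rounded values shown on the slide.
journal = {"MED": 1.0}
refs_average = {"LS": 0.23, "PS": 0.04, "MED": 0.17, "OC": 0.36, "MULT": 0.19, "OTH": 0.0}

combined = combine_journal_and_refs(journal, refs_average)
print({c: round(v, 3) for c, v in combined.items()})
```

With these rounded inputs the combined weights come out at about LS 0.115, PS 0.02, MED 0.585, OC 0.18 and MULT 0.095, which agrees with the slide up to rounding.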


Step 5. Collect all publications for each investigator, calculate average weights, and rank PS-OC categories:

Example: David A, 8 pubs

Category   Averaged J-R weight   Rank
LS         0.32                  2
PS         0.04                  4
MED        0.23                  3
OC         0.41                  1
OTH        0.01                  5

Person Inter-disciplinarity

3
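The per-investigator step (average the J-R weights over all publications, then rank the categories) might look like the sketch below. The function names and the two sample publications are hypothetical; only the `slide_weights` values come from the example above.

```python
CATEGORIES = ("LS", "PS", "MED", "OC", "OTH")


def average_weights(publication_weights):
    """Average the combined J-R weights over an investigator's publications."""
    if not publication_weights:
        raise ValueError("investigator has no publications")
    n = len(publication_weights)
    return {c: sum(w.get(c, 0.0) for w in publication_weights) / n for c in CATEGORIES}


def rank_categories(weights):
    """Rank 1 goes to the largest weight; ties keep their input order."""
    ordered = sorted(weights, key=weights.get, reverse=True)
    return {cat: position for position, cat in enumerate(ordered, start=1)}


# Two made-up publications, only to show the averaging.
pubs = [
    {"LS": 0.2, "PS": 0.0, "MED": 0.3, "OC": 0.5, "OTH": 0.0},
    {"LS": 0.4, "PS": 0.1, "MED": 0.1, "OC": 0.4, "OTH": 0.0},
]
profile = average_weights(pubs)

# Averaged weights from the slide example.
slide_weights = {"LS": 0.32, "PS": 0.04, "MED": 0.23, "OC": 0.41, "OTH": 0.01}
slide_ranks = rank_categories(slide_weights)
print(slide_ranks)
```

Applied to the slide's averaged weights, the ranking reproduces the order shown there (OC first, then LS, MED, PS, OTH).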


Step 6. Redistribute MED and OTH weights between OC,LS, and PS

Before: LS = 0.32, PS = 0.04, MED = 0.23, OC = 0.41, OTH = 0.01

After: LS = 0.4, PS = 0.05, OC = 0.55
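The slides do not state the rule used to redistribute the MED and OTH weights. One simple possibility, shown here purely as an assumption, is to drop MED and OTH and renormalize the remaining three weights proportionally. The function name `redistribute_proportionally` is hypothetical, and this rule gives roughly LS 0.42, PS 0.05, OC 0.53 for the example, close to but not identical with the slide's 0.4 / 0.05 / 0.55, so the authors' actual rule may differ (for instance, by giving OC a larger share of MED).

```python
CORE = ("LS", "PS", "OC")


def redistribute_proportionally(weights, core=CORE):
    """Drop non-core categories and rescale the core weights to sum to 1.

    Assumption: proportional renormalization; not confirmed by the slides.
    """
    core_total = sum(weights.get(c, 0.0) for c in core)
    if core_total == 0:
        raise ValueError("no weight in core categories to redistribute to")
    return {c: weights.get(c, 0.0) / core_total for c in core}


before = {"LS": 0.32, "PS": 0.04, "MED": 0.23, "OC": 0.41, "OTH": 0.01}
after = redistribute_proportionally(before)
print({c: round(v, 2) for c, v in after.items()})
```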


Validation

At the beginning of the program, investigators self-identified as oncologists or physicists
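A precision-recall check against the self-reported labels could be sketched as follows. The labels below are invented for illustration, the function name `precision_recall` is hypothetical, and the slides do not say how the category weights are turned into a single predicted label, so the sketch takes the predicted labels as given.

```python
def precision_recall(predicted, self_reported, label):
    """Precision and recall of one label against investigators' self-reports."""
    pairs = list(zip(predicted, self_reported))
    tp = sum(p == label and s == label for p, s in pairs)
    fp = sum(p == label and s != label for p, s in pairs)
    fn = sum(p != label and s == label for p, s in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical data: "OC" = oncologist, "PS" = physical scientist.
predicted = ["OC", "OC", "PS", "PS", "OC", "PS"]
self_reported = ["OC", "PS", "PS", "PS", "OC", "OC"]

oc_precision, oc_recall = precision_recall(predicted, self_reported, "OC")
ps_precision, ps_recall = precision_recall(predicted, self_reported, "PS")
print(oc_precision, oc_recall, ps_precision, ps_recall)
```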


Applications: how publication patterns change


Future Development

[Figure legend: Physical Scientist, Oncologist, Life Scientist; PS-OC Network Investigators, Outside Network Co-authors]

Conclusions

• Automated approach for decomposing scientific publications into grant-specific discipline categories
• Multi-step method with intermediate mapping
• Weighted SC assignment based on the SCs of an article and of its references
• Precision-recall validation based on investigators’ self-categorizations
• Oncologists within the NCI’s PS-OC program are publishing more physical sciences research, and physical scientists are publishing more oncology or life sciences research, during their years of program participation.

ilya.ponomarev@thomsonreuters.com

Thomson Reuters
Custom Analytics

Rockville, MD

SUPPORTING SLIDES