SAINT Toolkit for Applied Scientometrics

Edwin Horlings

August 2012

Structure

• Applied scientometrics

• The SAINT Toolkit

• Examples of applications

• Collaboration

Edwin Horlings | 2 / 28 | Patenting in the Netherlands 1945-2011

APPLIED SCIENTOMETRICS

Applied scientometrics

• We are living in the age of Big Data
  − continuous increase in the amounts of data on science, technology and innovation
  − Web of Science c. 45 million scientific publications, PATSTAT c. 67 million patent applications, with detailed information on every record
  − enormous expansion of web data (e.g. Twitter, blogs)
  − we now have the computer power to mine and analyse those data

• Increasing call for evidence-based policy
  − support policy and politics with reliable information
  − applied scientometrics can help understand the effects of policy

Applied scientometrics

Applying advanced quantitative methods to large heterogeneous datasets in order to extract patterns that show the structure and development of science, technology and innovation

What sort of patterns do we look for?

• Networks
  − co-authors of scientific papers
  − inventors and assignees of patents
  − members of the same associations
  − researchers working on or talking about the same topics

• Clustering of similar items
  − publications about the same topic or in the same specialisation
  − similar patents by different organisations
  − clusters in collaboration networks

• Statistical analysis of patterns (e.g. Social Network Analysis)
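
As a minimal sketch of the first kind of pattern, the snippet below builds a weighted co-authorship network from a handful of publication records. The records, the author names and the `coauthor_edges` helper are invented for illustration; nothing here is taken from the SAINT Toolkit itself.

```python
from collections import Counter
from itertools import combinations

# Hypothetical publication records: each entry is the author list of one paper.
papers = [
    ["Zhang Y", "Jansen A", "De Vries B"],
    ["Zhang Y", "Jansen A"],
    ["De Vries B", "Li W"],
]

def coauthor_edges(author_lists):
    """Count how often each pair of authors appears on the same paper."""
    edges = Counter()
    for authors in author_lists:
        # set() drops repeated names; sorted() gives each pair one canonical order
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges

coauthor_network = coauthor_edges(papers)
for (a, b), weight in sorted(coauthor_network.items()):
    print(f"{a} -- {b}: {weight}")
```

The edge weights (number of joint papers) are the usual starting point for clustering and for Social Network Analysis measures. Note that this sketch treats identical name strings as the same person, which is exactly the assumption that author disambiguation has to check.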

SAINT TOOLKIT

SAINT Toolkit

• the Science Assessment Integrated Network Toolkit

• Main components at the moment
  − ISI Parser: converts raw Web of Science data into a relational database
  − Word Splitter: cuts full text into words, eliminating stop words, and shortening words to their stem using different algorithms
  − Network Tools: identify clusters in networks using one of the best clustering algorithms (Blondel et al. 2008)
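
A rough sketch of what a word splitter of this kind might do: tokenise the text, drop stop words, and reduce the remaining words to a stem. The stop list and the suffix rules below are toy assumptions, far cruder than an established stemmer; the slides do not say which stemming algorithms the toolkit offers.

```python
import re

# Tiny illustrative stop list; a real tool would use a much longer one.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to", "with"}

# Crude suffix rules, longest first. An assumption for illustration only,
# not one of the stemming algorithms used by the toolkit.
SUFFIXES = ("ations", "ation", "ings", "ing", "es", "s")

def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def split_words(text):
    """Lowercase, tokenise on runs of letters, drop stop words, stem the rest."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

print(split_words("Clustering of publications in the networks of collaboration"))
```

The stems produced this way are not dictionary words; they only need to map variants of the same word (e.g. singular and plural) onto one key so that they can be counted together.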

SAINT Toolkit

• Under development
  − Integrating all tools into a Work Flow Manager
  − Word splitting using Natural Language Processing to extract terms rather than words
  − Adding alternative clustering algorithms for network analysis
  − Improving the Parser for new data sources, such as Scopus and online data
  − Developing tools for disambiguation of authors and addresses

APPLICATIONS

Author disambiguation

• Thousands of researchers with identical names (e.g. Y. Zhang): how to tell the difference?

• Important for evaluation and for research

• Developed an algorithm with 95-100% accuracy

• Now developing software tool with University of Paris Est (ESIEE)

Portfolio of individual researchers

Edwin Horlings & Thomas Gurney | 12 / 28 | Search strategies along the academic lifecycle

• How do individual researchers develop their scientific portfolio?
  − different stages in their career
  − different problem areas, often simultaneous
  − coherence of their portfolio
  − author position

• Developed a scientometric method to visualise and statistically analyse

Ronald Plasterk, former Minister of Education, Science and Culture

Barend van der Meulen | 13 / 28 | SAINT Toolkit for applied scientometrics

Edwin Horlings | 14 / 22 | Science policy for the bio-economy

Bio-energy worldwide 8,414 publications 2008-2009

primary strength of China

secondary strength of China

primary strength of Netherlands

secondary strength of Netherlands

Advantage of having a large facility

• Does a large-scale facility provide home advantage to local research groups?
  − accumulating reputation
  − opening up new avenues of research
  − developing social networks
  − producing scientific

• Examine for high-field magnet laboratories, such as those in Hefei and Nijmegen

Collaboration networks in graphene

Collaboration between institutes in graphene research worldwide, 1990-2011 (17,968 publications)

INSTITUTION (E.G. UNIVERSITY)

Collaboration networks in graphene

Clusters of institutes that collaborate more with each other than with other institutes

CLUSTER
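
The idea of a cluster as a group that collaborates more internally than externally is what the modularity score Q captures, and modularity is the quantity that the Blondel et al. (2008) algorithm optimises. The sketch below only computes Q for a given partition of a toy network; it does not implement the search for the best partition, and the institute labels and links are hypothetical.

```python
from collections import defaultdict

def modularity(edges, cluster_of):
    """Modularity Q of a partition of an undirected, unweighted network.

    For each cluster: (share of links inside the cluster) minus
    (share expected if links were placed at random, given node degrees).
    """
    m = len(edges)
    internal = defaultdict(int)    # links with both ends in the cluster
    degree_sum = defaultdict(int)  # total degree of the cluster's nodes
    for u, v in edges:
        degree_sum[cluster_of[u]] += 1
        degree_sum[cluster_of[v]] += 1
        if cluster_of[u] == cluster_of[v]:
            internal[cluster_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Toy collaboration network: two triangles of institutes joined by one link.
collab_edges = [("A", "B"), ("B", "C"), ("A", "C"),
                ("D", "E"), ("E", "F"), ("D", "F"),
                ("C", "D")]

by_triangle = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
mixed_up = {"A": 0, "B": 1, "C": 0, "D": 1, "E": 0, "F": 1}

print(modularity(collab_edges, by_triangle))  # higher: clusters match the triangles
print(modularity(collab_edges, mixed_up))     # lower: clusters cut across the triangles
```

A partition that follows the dense groups scores higher than one that cuts across them, which is why maximising Q yields clusters of the kind shown on this slide.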

Collaboration networks in graphene

[Figure labels: EU, NORTH AMERICA, SOUTH EAST ASIA]

Networks of scientific collaboration in graphene are highly regionally clustered

Collaboration networks in graphene

All Chinese institutions in the network highlighted in black

Collaboration in graphene research in China

How Dutch universities work on scientific topics

Celiac Disease Consortium 2000-2003

TOPIC

UNIVERSITY

How Dutch universities work on scientific topics

• Denser network
• More institutions involved
• More coherent: more universities work on the same small set of topics
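
"Denser" can be made precise as network density: the share of possible links that are actually present. For a two-mode network of universities and topics, the number of possible links is the number of universities times the number of topics. The sketch below uses invented links, not the consortium data behind the slides, and `bipartite_density` is a hypothetical helper.

```python
def bipartite_density(links):
    """Share of possible university-topic links that are actually present."""
    if not links:
        return 0.0
    universities = {u for u, _ in links}
    topics = {t for _, t in links}
    # Only nodes that appear in at least one link are counted here.
    return len(set(links)) / (len(universities) * len(topics))

# Hypothetical (university, topic) links for two periods.
period_1 = [("Univ A", "genetics"), ("Univ B", "immunology")]
period_2 = [("Univ A", "genetics"), ("Univ A", "immunology"),
            ("Univ B", "genetics"), ("Univ B", "immunology"),
            ("Univ C", "genetics")]

print(bipartite_density(period_1))
print(bipartite_density(period_2))
```

In this toy case the second period has both more institutions and a higher density, because more universities connect to the same two topics.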

Celiac Disease Consortium 2007-2010