bibliometric big data - bdigital.unal.edu.co/12475/7/biliometricsbigdata.pdf
Bibliometric Big Data and its Uses
Dr. Gali Halevi
Elsevier, NY
In memoriam
https://www.youtube.com/watch?v=sRBqTqtMncw
The Multidimensional Research Assessment Matrix
Unit of assessment | Purpose | Output dimensions | Bibliometric indicators | Other indicators
Individual | Allocate resources | Research productivity | Publications | Peer review
Research group | Improve performance | Quality, scholarly impact | Journal citation impact | Patents, licences, spin offs
Department | Increase multi-discipl. research | Innovation and social benefit | Actual citation impact | Invitations for conferences
Institution | Increase regional engagement | Sustainability & Scale | Internat. co-authorship | External research income
Research field | Promotion, hiring | Research infrastruct. | Citation ‘prestige’ | PhD completion rates
Read column-wise
Indicators that are appropriate in one context may be useless or invalid in another
The choice of indicators depends upon:
Which unit is to be assessed?
Which aspect is being assessed?
Why is the assessment done?
“Meta” assumptions on the state of the system under assessment
CASE 1 [My view: Inappropriate use]
Meta-level (policy issue): Recruitment of new researchers at research universities
Policy measure: Select the best researchers
Bibliometric operationalization: Rank researchers by average impact factor of journals in which they published and select nr. 1
CASE 2 [My view: Appropriate use]
Meta-level (policy issue): Research community is not sufficiently oriented toward international networks
Policy measure: Stimulate publication in good international journals
Bibliometric operationalization: Count and reward articles in the first impact quartile of journals in subject field
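The Case 2 operationalization (counting articles in the first impact quartile of journals in a subject field) might be sketched as follows. This is only an illustration under stated assumptions: the journal names, impact values, and the rank-based quartile rule (top 25% of journals per field, rounded up) are hypothetical and not taken from the presentation.

```python
from collections import defaultdict
from math import ceil

def top_quartile_journals(journals):
    """Return the journals in the top impact quartile of their subject field.

    `journals` maps journal name -> (subject field, impact value).
    """
    by_field = defaultdict(list)
    for name, (field, impact) in journals.items():
        by_field[field].append((impact, name))
    top = set()
    for ranked in by_field.values():
        ranked.sort(reverse=True)          # highest impact first
        cutoff = ceil(len(ranked) / 4)     # assumed size of the first quartile
        top.update(name for _, name in ranked[:cutoff])
    return top

def count_q1_articles(articles, journals):
    """Count, per author, the articles published in first-quartile journals."""
    q1 = top_quartile_journals(journals)
    counts = defaultdict(int)
    for author, journal in articles:
        if journal in q1:
            counts[author] += 1
    return dict(counts)

# Hypothetical illustration data.
JOURNALS = {
    "Energy A": ("Energy", 6.0), "Energy B": ("Energy", 3.5),
    "Energy C": ("Energy", 2.0), "Energy D": ("Energy", 1.0),
    "Library X": ("Library Science", 2.5),
    "Library Y": ("Library Science", 0.8),
}
ARTICLES = [("author_1", "Energy A"), ("author_1", "Energy C"),
            ("author_2", "Library X"), ("author_2", "Energy A")]

print(count_q1_articles(ARTICLES, JOURNALS))
```

Quartiles are computed within each subject field, so a journal is only compared with journals of its own field.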
Big Data in Bibliometrics
In the past decade bibliometric data expanded to include a variety of large-scale data such as:
Citations
References
Key words (descriptors)
Usage data
Full text analytics
The availability of the data and technological capabilities brought forth a strong proliferation of bibliometric databases and data-analytical tools for the development of:
Sophisticated and custom scientific evaluation indicators
Measurements of the behavior of researchers, journal editors and publishers
Societal impact indicators of research, such as its technological value or its contribution to the enlightenment of the general public
Creation and analysis of large datasets by combining multiple datasets
Compound Big Datasets and their objects of study
Examples of Big Data Analysis in Bibliometrics and its uses
Relationships between Downloads & Citations
Answers questions such as:
1. How impactful is my research?
2. Are my resources optimized?
[Figure: Article cycle, the effect of citations upon downloads. Curves: downloads; citations (red curve, low numbers). Marked dates: corrected proof online 04-03-2008; corrected paginated proof online 22-08-2008.]
Large differences in SD download patterns among journals
Analysis by journal and doc type
1. The number of a journal’s downloads is about 100 times its number of citations (in a 5-year window)
2. The “advantage” of reviews over normal articles is much larger for downloads than it is for citations
3. The findings suggest that reviews are better handled separately in performance measurement
4. Large differences exist between journals in the absolute level and temporal patterns in download counts
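A comparison of this kind could be sketched as below: downloads and citations are summed per journal and document type, and the downloads-to-citations ratio is reported per group, so that reviews stay separate from normal articles. The record layout and all numbers are hypothetical illustration values, not data from the study.

```python
from collections import defaultdict

def download_citation_ratios(records):
    """Return downloads / citations per (journal, document type).

    `records` holds (journal, doc_type, downloads, citations) tuples.
    Groups without any citation are skipped to avoid division by zero.
    """
    totals = defaultdict(lambda: [0, 0])
    for journal, doc_type, downloads, citations in records:
        totals[(journal, doc_type)][0] += downloads
        totals[(journal, doc_type)][1] += citations
    return {key: d / c for key, (d, c) in totals.items() if c > 0}

# Hypothetical illustration data.
RECORDS = [
    ("Journal A", "article", 900, 10),
    ("Journal A", "article", 1100, 10),
    ("Journal A", "review", 6000, 30),
    ("Journal B", "article", 500, 0),
]

for key, ratio in download_citation_ratios(RECORDS).items():
    print(key, ratio)
```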
Patents and scientific articles: Library Science Example
Answers questions such as:
1. How does my research impact the economy?
2. Can I direct my research better?
3. Can I foster corporate / academic relations?
Co-Citation analysis in Patents
We found that from 1999 the terms began to appear in patents
The first patent that uses co-citation analysis as a method is a Xerox patent that uses it to generate clusters of documents in a database.
Other assignees included Google, AT&T and Microsoft, among others
Predictive Trends Modelling with Author and Index Keywords
Answers questions such as:
1. What are the upcoming trends in my discipline?
2. Who is researching in emerging areas?
Why use Author Keywords
Author keywords are assigned by the researchers themselves to describe their work
They show unique tagging of the content especially when a new discovery or methodology is concerned
When tracked over time they can show growth, adoption and development
When compared to the Index keywords they can reveal new concepts that are only later adopted by the mainstream thesauri and indexes
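One way to make the comparison between author keywords and index keywords concrete is to measure how many years pass between a term's first use by authors and its first appearance in the index. The sketch below assumes hypothetical (year, keyword) records and matches keywords case-insensitively. It is not the method used in the presentation.

```python
def first_year(records):
    """Map each keyword (lower-cased) to the earliest year it appears."""
    first = {}
    for year, keyword in records:
        key = keyword.lower()
        if key not in first or year < first[key]:
            first[key] = year
    return first

def adoption_lag(author_kw, index_kw):
    """Years between first use as author keyword and as index keyword.

    The value is None when the index has not adopted the term at all.
    """
    authors, index = first_year(author_kw), first_year(index_kw)
    return {kw: (index[kw] - year if kw in index else None)
            for kw, year in authors.items()}

# Hypothetical illustration data.
AUTHOR_KW = [(2005, "Heat pump"), (2003, "heat pump"), (2008, "Wind")]
INDEX_KW = [(2007, "Heat pump")]

print(adoption_lag(AUTHOR_KW, INDEX_KW))
```

A positive lag means authors used the term before the index did. A value of None flags a candidate emerging concept.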
http://public.tableausoftware.com/views/Presentationlink/Sheet1?:embed=y&:display_count=no
[Figure: keywords shown include Distribution, Wind, Heat Engine, Heat exchanger, Heat pump, Thermal Performance]
Citation context analysis
Combining citation data with full text article data
Answers questions such as:
1. Where is your research cited?
2. How multidisciplinary is your research?
Citation density & sectional distribution
Most citations appear in the introduction
There are slightly more out-discipline citations than in-discipline citations
The findings and discussion sections rank second and third, respectively, in the number of citations they contain.
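A sectional distribution like the one described above could be computed along these lines. The sketch assumes each in-text citation is available as a hypothetical (section, field of the cited work) pair. The data are invented for illustration and do not reproduce the study's results.

```python
from collections import Counter

def section_distribution(citations):
    """Share of in-text citations per article section, largest first."""
    counts = Counter(section for section, _ in citations)
    total = sum(counts.values())
    return [(section, n / total) for section, n in counts.most_common()]

def out_discipline_share(citations, home_field):
    """Share of citations pointing to work outside the home discipline."""
    outside = sum(1 for _, field in citations if field != home_field)
    return outside / len(citations)

# Hypothetical illustration data: (section, field of the cited work).
CITATIONS = [
    ("Introduction", "Energy"), ("Introduction", "Chemistry"),
    ("Introduction", "Physics"), ("Introduction", "Energy"),
    ("Introduction", "Chemistry"), ("Findings", "Energy"),
    ("Findings", "Physics"), ("Discussion", "Chemistry"),
]

print(section_distribution(CITATIONS))
print(out_discipline_share(CITATIONS, "Energy"))
```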
Towards an Author Evaluation Tool (AET)
Current state
Several metrics are widely used in research performance including:
Journal Impact Factor
H-Index
G-Index
i10-Index
(And others; See http://www.harzing.com/pophelp/metrics.htm#gindex )
Most of these methods count articles and citations and reduce them to a single number
Methods based on statistical calculations have been highly criticized for their rigid nature, complexity and lack of context
Despite the growing criticism of research evaluation methodologies, evaluative metrics are still needed for:
Research and researcher performance
Research funding
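To show how the author-level metrics listed above turn citation counts into a single number, here is a minimal sketch using the commonly cited definitions: h is the largest number such that h papers have at least h citations each, g is the largest number such that the top g papers together have at least g squared citations (capped here at the number of papers), and i10 is the number of papers with at least 10 citations. The citation counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

def g_index(citations):
    """Largest g such that the top g papers have >= g*g citations in total."""
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, c in enumerate(ranked, start=1):
        total += c
        if total >= rank * rank:
            g = rank
    return g

def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for c in citations if c >= 10)

# Hypothetical citation counts for one author's papers.
CITATIONS = [10, 8, 5, 4, 3, 0, 0]

print(h_index(CITATIONS), g_index(CITATIONS), i10_index(CITATIONS))
```

The three numbers differ for the same publication list, which illustrates why a single metric gives little context on its own.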
Lessons learned
When developing new indicators/metrics we need to:
Base our work on established and accepted methods
Combine methods in a way that gives proper weight to each
Take into account every data point possible so that the indicator is as comprehensive as possible
Make the method flexible enough to accommodate future data points and/or methods
Allow the person or institution measured a certain amount of control over the data being calculated
Aim for a simple, understandable metric that is transparent and easy to use
Basic characteristics of the Self-Organizing Research Assessment Metrics
Our model of author assessment combines proper benchmarking with data accuracy enhancement tools
The tool facilitates completeness checks and corrections of the assessed author’s publication list
Compares an assessed author with other researchers who are active in the same subject field
Compares an assessed author with other researchers who are in the same phase of their scientific career
Can be easily used by tenured and young researchers as a self-assessment tool, and also by administrators and analysts
Selecting an author and collecting publications data
We tested level 2 of the model using author data found in Scopus in the field of energy
Creating the field appropriateness via cited references
This includes the journal, authors and years of publication of the cited references
Creating the benchmarking network
Selecting the appropriate benchmarking network
Author Evaluation Metrics
The number of articles is above the median of the benchmarked network
Citations are slightly below the median of the benchmarked network
Citations per article are in the bottom 25% of the benchmarked network
The author has a high publication rate, but the publications are not as impactful
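The positioning of an author against a benchmark network could be sketched as a percentile rank per metric. The version below is an assumption-laden illustration: the (articles, citations) records are invented, and the percentile rank is defined here as the share of peers strictly below the author's value, which is not necessarily how the tool in the presentation computes it.

```python
def percentile_rank(value, peers):
    """Percentage of peers whose value is strictly below `value`."""
    return 100.0 * sum(1 for p in peers if p < value) / len(peers)

def benchmark(author, network):
    """Rank an author against a network on three metrics.

    `author` and every entry of `network` are (articles, citations) tuples.
    """
    def per_article(record):
        return record[1] / record[0]

    return {
        "articles": percentile_rank(author[0], [n[0] for n in network]),
        "citations": percentile_rank(author[1], [n[1] for n in network]),
        "citations_per_article": percentile_rank(
            per_article(author), [per_article(n) for n in network]),
    }

# Hypothetical illustration data: (articles, citations).
AUTHOR = (40, 420)
NETWORK = [(10, 100), (20, 400), (30, 450), (35, 700),
           (50, 900), (15, 200), (25, 500), (45, 800)]

print(benchmark(AUTHOR, NETWORK))
```

With these invented numbers the author sits high on article count, somewhat below the median on citations, and in the bottom quarter on citations per article, mirroring the pattern described above.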
Valuable notions and distinctions
Data accuracy is crucial
Use data verified by authors themselves
Combine metrics and expert knowledge
Impact factors are no substitute for actual impact
Use multiple indicators
Take into account career phase
Take into account unintended effects
Focus on top vs. bottom of quality distribution
Co-Authorship & Collaboration
Answers questions such as:
1. How diverse are my collaborations?
2. Can I expand my research collaborations?
3. With whom, in which countries, and on what should I be collaborating?
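Questions like these are typically approached by counting co-authorship links between countries. The sketch below assumes each paper is reduced to the set of countries in its author affiliations. The paper list is hypothetical and the function names are illustrative only.

```python
from collections import Counter

def collaboration_profile(papers, home):
    """Return the international co-authorship share and partner counts.

    `papers` is a list of sets, each holding the affiliation countries
    of one paper. `home` is the country being analysed.
    """
    home_papers = [p for p in papers if home in p]
    international = [p for p in home_papers if len(p) > 1]
    partners = Counter(c for p in international for c in p if c != home)
    share = len(international) / len(home_papers)
    return share, dict(partners)

# Hypothetical illustration data.
PAPERS = [
    {"Colombia"},
    {"Colombia", "United States"},
    {"Colombia", "Spain", "United States"},
    {"Colombia", "Brazil"},
]

share, partners = collaboration_profile(PAPERS, "Colombia")
print(share, partners)
```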
Main research questions and bibliometric indicators used in this study
A bibliometric model for capturing the state of scientific development
PRE-DEVELOPMENT: Low research activity without clear policy or structural funding of research.
BUILDING-UP: Collaborations with developed countries are established. National researchers enter international scientific networks.
CONSOLIDATION AND EXPANSION: The country develops its own scientific infrastructure. The amount of funds available for research increases.
INTERNATIONALISATION: Research institutions in the country start functioning as fully fledged partners, and increasingly take the lead in international collaborations.
Colombia’s Scientific Collaborations
Subject Areas
Colombia’s Collaborations in Medicine
Colombia’s Collaborations in Agriculture
Dr. Gali Halevi | Senior Research Analyst and Program Director | Email: g.halevi@elsevier.com