Bibliometrics: From Garfield to Google Scholar

Elaine M. Lasda Bergman
University at Albany

Upstate NY SLA Spring Meeting
April 20, 2012

What we’re going to cover

• What is the study of Bibliometrics?
• Bibliometrics which assess entire journals
– JIF, Eigenfactor, SNIP, SJR
• Bibliometrics assessing authors, articles, institutions
– citation count, H-index, e-index, etc.

What is bibliometrics?

Eugene Garfield
• Scholarly communication: tracing the history and evolution of ideas from one scholar to another
• Measures the scholarly influence of articles, journals, scholars, institutions

Three sources for citation data

• Citation data overlaps, but not completely
• Unique citing references in all three databases
• Unique metrics developed using each database
– Metrics could be computed in any one of these but most are tied to a particular source

JOURNAL METRICS

What is measured?

• Journal Ranking
– “Quality” or “Importance” of journal relative to other journals
• Usually within a given field of study
• There are many ways to measure “quality,” “importance”

“Impact”

• Journal Impact Factor (JIF)
• Web of Science – Journal Citation Reports
• Basically “how fast are ideas spreading from this journal to other publications?”
• Formula is a ratio:
Number of citations received in a given year by articles the journal published in the previous 2 years, divided by the number of scholarly articles the journal published in those 2 years

Journal Impact Factor

• Journal of Hypothetical Examples
Citing references appearing in 2010, to articles published in Journal in 2009 and 2008: 100
Total number of articles in Journal published in 2009 and 2008: 200
JIF: 100 / 200 = 0.50
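The ratio in the Journal of Hypothetical Examples can be sketched in a few lines of Python. The function name and parameter names are illustrative assumptions, not part of any Web of Science tool; the numbers are the ones from the slide.

```python
def journal_impact_factor(citations_to_prior_two_years: int,
                          articles_in_prior_two_years: int) -> float:
    """Citations received this year by the prior two years' articles,
    divided by the number of articles published in those two years."""
    if articles_in_prior_two_years <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_to_prior_two_years / articles_in_prior_two_years


# Figures from the Journal of Hypothetical Examples:
# 100 citing references in 2010 to the 200 articles from 2008 and 2009.
jif_2010 = journal_impact_factor(100, 200)
print(jif_2010)  # prints 0.5
```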

Concerns with impact factor

• Cannot be used to compare across disciplines
• Two year time frame not adequate for social sciences, humanities
• Coverage of some disciplines not sufficient in Web of Science
• Is a measure of “impact” a measure of “quality”?

“Influence”

• Eigenfactor.org
• Web of Science: Journal Citation Reports
• Eigenvector analysis: Similar to Google PageRank, “chain of citations”
• Takes into account the total amount of “citation traffic” appearing in JCR
Influence of the citing journal, divided by the total number of citations appearing in that journal.

“Influence”

• Journal Impact Factor:
– All citing references weighted equally
• Eigenfactor:
– SOME CITING REFERENCES ARE MORE IMPORTANT THAN OTHERS
• The citing articles from journals that are heavily cited themselves demonstrate greater influence
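The "chain of citations" idea can be illustrated with a PageRank-style power iteration. This is a minimal sketch and not the published Eigenfactor algorithm, which involves further steps. The three journals and their citation counts are invented, and the damping value of 0.85 is an assumption borrowed from PageRank. Each citing journal passes on its own influence, divided by the total number of citations appearing in that journal.

```python
def influence_scores(cites, damping=0.85, iterations=100):
    """cites[citing][cited] = number of citations from one journal to another.
    Assumes every cited journal also appears as a key of `cites`."""
    journals = sorted(cites)
    n = len(journals)
    scores = {j: 1.0 / n for j in journals}  # start with equal influence
    for _ in range(iterations):
        new_scores = {j: (1.0 - damping) / n for j in journals}
        for citing in journals:
            outgoing = sum(cites[citing].values())
            if outgoing == 0:
                # A journal with no references spreads its influence evenly.
                for cited in journals:
                    new_scores[cited] += damping * scores[citing] / n
                continue
            for cited, count in cites[citing].items():
                # Influence of the citing journal, divided by its total citations.
                new_scores[cited] += damping * scores[citing] * count / outgoing
        scores = new_scores
    return scores


# Hypothetical citation counts among three journals (self-citations left out).
example_cites = {
    "A": {"B": 30, "C": 10},
    "B": {"A": 40, "C": 5},
    "C": {"A": 20, "B": 20},
}
scores = influence_scores(example_cites)
for journal in sorted(scores, key=scores.get, reverse=True):
    print(journal, round(scores[journal], 3))
```

In this toy data Journal A ranks first because most of its citations come from Journal B, which is itself heavily cited.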

Considerations

• Eigenfactor will always be bigger if a journal is larger, i.e., publishes more articles

• Article Influence Score: corrects for journal size
– takes the journal’s Eigenfactor score and further divides it by the number of articles in the journal.
– Correlation to the JIF
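The size correction can be sketched as follows. As I understand the published Eigenfactor method, the division is by the journal's share of all articles in the database, scaled by 0.01 so that the average journal scores 1.00; treat that scaling as an assumption here. The journal figures are invented for illustration.

```python
def article_influence(eigenfactor_score: float,
                      journal_articles: int,
                      all_articles: int) -> float:
    """Eigenfactor score divided by the journal's share of all articles.
    The 0.01 scaling is assumed so that the mean score is 1.00."""
    article_share = journal_articles / all_articles
    return 0.01 * eigenfactor_score / article_share


# Hypothetical journal: Eigenfactor score 0.05, publishing 100 of the
# 400,000 articles in the database.
ai_score = article_influence(0.05, 100, 400_000)
print(round(ai_score, 2))
```

A large journal and a small journal with the same Eigenfactor score end up with different Article Influence scores, because the smaller journal earned that citation traffic with fewer articles.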

Examples

• For the year 2011, Neurology had an Eigenfactor score of 0.159. This number is the journal’s percentage of all citation traffic of articles in the JCR

• For the year 2011, Neurology had an Article Influence score of 2.57. This means an average article in this journal is roughly 2½ times as influential as an average article in all of JCR

• www.eigenfactor.org

“Citation Potential”

• SNIP: Source Normalized Impact per Paper
• Uses Scopus data
• Citation Potential = total number of citing references in all journals which have cited this journal
• Takes an average citation count
The ratio of the journal’s average citation count per paper to the citation potential in its subject field
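The SNIP ratio can be sketched as below. This is a simplified illustration rather than the calculation Scopus performs: the citation-potential values are invented, and in practice they are derived from the reference lists of the citing journals.

```python
def snip(citations: int, papers: int, field_citation_potential: float) -> float:
    """Average citations per paper, divided by the citation potential
    of the journal's subject field."""
    average_citations_per_paper = citations / papers
    return average_citations_per_paper / field_citation_potential


# Two hypothetical journals in fields with very different citation habits.
math_journal = snip(citations=150, papers=100, field_citation_potential=0.5)
biology_journal = snip(citations=600, papers=100, field_citation_potential=2.0)
print(math_journal, biology_journal)  # prints 3.0 3.0
```

The biology journal has four times the raw citations per paper, yet both journals score the same once the field's citation potential is taken into account.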

Pros and cons of SNIP

• Can compare SNIP scores across disciplines

• Aggregate of a journal, so larger journals automatically have higher scores than smaller journals

“Prestige”

• SJR: SCImago Journal Rank
• Uses Scopus data
• Measures “current average prestige per paper”
Prestige factors include: # of journals in the Scopus database, # of articles in Scopus from this journal, citation count, eigenvector analysis of important citing references, corrections for self-citations, and normalization by the number of significant works published in the journal.

Pros and Cons of SJR

• Corrects for self citations
• Correlated to JIF
• Scores can be compared across disciplines
• Web version provides data on countries
• Three year window not good for social sciences
• http://www.scimagojr.com/



Examples in Scopus

METRICS FOR SCHOLARS, AUTHORS, INSTITUTIONS, ETC.

Citation count

• Number of times cited within a given time period
– Journals, Authors, Articles, etc.
• Does not take into account
– Materials not included in citation database
– Self citations
– Variations in citation patterns/rates

Citation count

• Citation counts will vary depending on which database you use

• It is very difficult to get a complete count of all citing references

H-index

• Scopus, Google Scholar, WoS?
• Meant to account for differences in citation patterns (i.e., “one-hit wonders” vs. consistent record of scholarship)

“A scientist has index h if h of his/her Np papers have at least h citations each and the other (Np-h) papers have no more than h citations each” (Hirsch 2005)

[Chart: H-index. Number of Citations by Article Number (1 to 7) for Scholar A and Scholar B]

H-index Example

Article    Scholar A       Scholar B
1          10              27
2          10              12
3          9               5
4          8               4
5          7               4
6          6               2
7          6               2
Total      56 citations    56 citations
h-index    6               4
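The h-index values in the example can be reproduced with a short Python sketch. The citation counts are the two columns of the example (Scholar A: 10, 10, 9, 8, 7, 6, 6; Scholar B: 27, 12, 5, 4, 4, 2, 2); the function name is an illustrative choice.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


scholar_a = [10, 10, 9, 8, 7, 6, 6]
scholar_b = [27, 12, 5, 4, 4, 2, 2]

print(sum(scholar_a), h_index(scholar_a))  # prints 56 6
print(sum(scholar_b), h_index(scholar_b))  # prints 56 4
```

Both scholars have 56 citations in total, but Scholar A's citations are spread evenly across papers, which produces the higher h-index.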

Variations on the H-index
• G-index (Egghe 2006): gives greater weight to highly cited articles
– The largest number g such that the top g articles have received a combined total of at least g² citations
• E-index (Zhang 2009): gives greater weight to highly cited articles
– The square root of the surplus of citations in the h-set beyond h²
• Contemporary h-index (Sidiropoulos et al. 2006): gives greater weight to newer articles
– “parameterized”: citations to an article published in the current year count 4 times; to an article published four years ago, 1 time; to an article published 6 years ago, 4/6 times
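The g-index and e-index can be sketched as below, reusing the Scholar A and Scholar B citation counts from the H-index example. The sketch assumes the common convention that g cannot exceed the number of papers; the function names are illustrative.

```python
import math


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


def g_index(citations):
    """Largest g such that the top g papers together have at least g*g citations.
    Assumption: g is capped at the number of papers."""
    ranked = sorted(citations, reverse=True)
    g, running_total = 0, 0
    for rank, count in enumerate(ranked, start=1):
        running_total += count
        if running_total >= rank * rank:
            g = rank
    return g


def e_index(citations):
    """Square root of the citations in the h-set beyond h*h."""
    ranked = sorted(citations, reverse=True)
    h = h_index(ranked)
    return math.sqrt(sum(ranked[:h]) - h * h)


scholar_a = [10, 10, 9, 8, 7, 6, 6]
scholar_b = [27, 12, 5, 4, 4, 2, 2]
print(g_index(scholar_a), round(e_index(scholar_a), 2))
print(g_index(scholar_b), round(e_index(scholar_b), 2))
```

Scholar B has the lower h-index but the higher e-index, because the e-index credits the surplus citations of the two heavily cited papers.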

Variations on the H-index
• Individual h-index (Batista et al. 2006): accounts for co-authors
– Divides the h-index by the average number of authors per paper
• Alternative individual h-index (Harzing): accounts for co-authors
– Normalizes citation counts: divides each paper’s # of citations by that paper’s # of authors and then computes the h-index
• Another alternative individual h-index (Schreiber 2006): accounts for co-authors
– Counts fractions of papers instead of dividing citations by # of authors, keeps full citation count
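The first two co-author corrections can be sketched as follows. The five papers and their author counts are invented for illustration, and the Batista variant here averages authors over the papers that contribute to the h-index, which is an assumption about how the average is taken.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


def individual_h_index(papers):
    """h-index divided by the average number of authors of the papers
    that contribute to the h-index. papers = [(citations, authors), ...]"""
    ranked = sorted(papers, key=lambda paper: paper[0], reverse=True)
    h = h_index([citations for citations, _ in ranked])
    if h == 0:
        return 0.0
    average_authors = sum(authors for _, authors in ranked[:h]) / h
    return h / average_authors


def normalized_individual_h_index(papers):
    """Divide each paper's citations by its number of authors,
    then compute the h-index of the normalized counts."""
    return h_index([citations / authors for citations, authors in papers])


# Hypothetical scholar: (citations, number of authors) for each paper.
papers = [(12, 2), (9, 3), (7, 1), (4, 4), (2, 2)]
print(h_index([c for c, _ in papers]))        # plain h-index
print(individual_h_index(papers))             # h divided by average authors
print(normalized_individual_h_index(papers))  # h-index of per-author counts
```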

Variations on the H-index

• Age weighted citation rate and AW index (Jin 2007): accounts for variations in citation patterns over time
– AWCR = the sum of all age-weighted citation counts (each paper’s citations divided by its age in years); Jin’s original AR-index takes the square root of this sum over the papers that contribute to the h-index
– AW-index = the square root of the AWCR
– Per-author AWCR: each paper’s age-weighted citation count divided by the number of authors for that paper
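The age-weighted measures can be sketched as below. The sketch assumes the Publish or Perish convention, in which the AWCR is the plain sum of each paper's citations divided by its age and the AW-index is the square root of that sum. The three papers are invented, and each paper's age in years is supplied directly rather than computed from a publication date.

```python
import math


def awcr(papers):
    """Sum of citations divided by paper age.
    papers = [(citations, age_in_years, authors), ...], with age >= 1."""
    return sum(citations / age for citations, age, _ in papers)


def aw_index(papers):
    """Square root of the AWCR."""
    return math.sqrt(awcr(papers))


def per_author_awcr(papers):
    """Age-weighted citations of each paper divided by its number of authors."""
    return sum(citations / age / authors for citations, age, authors in papers)


# Hypothetical papers: (citations, age in years, number of authors).
papers = [(12, 4, 2), (9, 3, 3), (4, 1, 1)]
print(awcr(papers))             # prints 10.0
print(round(aw_index(papers), 2))
print(per_author_awcr(papers))  # prints 6.5
```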

Publish or Perish

• Google Scholar citation information
• Interdisciplinary topics, fields relying on conference papers or reports
• Greatest variety of metrics
• Dirty data
• Unverified data
• Nonscholarly sources

Differences in H-index

Scopus vs. Google Scholar (PoP)
The Case of Eugene Garfield

PoP Interface

PoP Search for Garfield

An aside: Why I don’t like PoP for Journal Metrics

Scopus Search for Garfield


Citation overview


Link to graphic information next to citation overview

Google scholar citations

http://scholar.google.com/intl/en/scholar/citations.html

Microsoft Academic

http://academic.research.microsoft.com/




Considerations

• Don’t measure an individual article’s impact by the metrics for the entire journal
• Do I need a comparison within a discipline or across disciplines?
• Does the citation pattern matter or just the count?
• Does the database being used cover my subject as thoroughly as possible?
• To what degree does my subject area rely on non-journal scholarly publications?
• Not all citing references are positive!

Questions??

Elaine Lasda
