Bibliometric research methods
Faculty Brown Bag
IUPUI
Cassidy R. Sugimoto
Overview
Vocabulary
Citation analysis
Citation indices
Bibliometric laws
Impact factor
Applications
Vocabulary
Scholarly communications: formal and informal
Scientometrics: scientific communication
Informetrics: thinking beyond scholarly “texts”
Webometrics: web
Bibliometrics: application of statistical and mathematical methods (formal channels)
Citation analysis
Why do people cite? Why are some articles not cited? What does a citation mean?
Citing document: A
Cited document: B
A → B
A references B
B is cited by A
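The two directions of the relationship ("A references B" and "B is cited by A") can be sketched in code. This is only an illustrative sketch: the document labels and the `references` data are hypothetical, and `build_cited_by` is a name introduced here. It shows that a citation index is, in essence, the reference lists turned around.

```python
from collections import defaultdict

# Hypothetical toy data: each citing document maps to the documents it references.
references = {
    "A": ["B", "C"],
    "D": ["B"],
}

def build_cited_by(references):
    """Invert 'X references Y' into 'Y is cited by X'."""
    cited_by = defaultdict(list)
    for citing, cited_docs in references.items():
        for cited in cited_docs:
            cited_by[cited].append(citing)
    return dict(cited_by)

cited_by = build_cited_by(references)
print(cited_by["B"])  # the documents that cite B
```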
Who’s on first?
Embedded citation index from ʿEn mishpat: Babylonian Talmud (1546)
(Weinberg, 1997)
Shepard’s Citation Index (1873)
Shapiro (1992)
Institute for Scientific Information (ISI)
Scopus
Google Scholar
Comparison
Distribution of unique and overlapping citations in Scopus and Web of Science (n=8,549)
Overlap: 57% (4,892)
Scopus only: 29% (2,441)
Web of Science only: 14% (1,216)
Scopus total: n=7,333 (86%)
Web of Science total: n=6,108 (71%)
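The percentages in this comparison follow directly from the three counts. A minimal sketch of the arithmetic, using the figures quoted in the distribution (the variable and function names are introduced here for illustration only):

```python
# Counts taken from the Scopus / Web of Science comparison.
overlap = 4892      # citations found in both indexes
scopus_only = 2441  # citations found only in Scopus
wos_only = 1216     # citations found only in Web of Science

total = overlap + scopus_only + wos_only
scopus_total = overlap + scopus_only
wos_total = overlap + wos_only

def share(count, whole):
    """Percentage of `whole`, rounded to the nearest integer."""
    return round(100 * count / whole)

rows = [
    ("Overlap", overlap),
    ("Scopus only", scopus_only),
    ("Web of Science only", wos_only),
    ("Scopus total", scopus_total),
    ("Web of Science total", wos_total),
]
for label, count in rows:
    print(f"{label}: {count:,} ({share(count, total)}%)")
```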
Are you a citation index?
Bibliometric research OR “Why I love good indexes”
Citation analysis: methods
Not just articles…
Variable: PRODUCERS
Variable: PRODUCERS
Variable: ARTIFACTS
Variable: CONCEPTS
Hybrid approaches
Chaomei Chen: http://www.pages.drexel.edu/~cc345/citespace/figures/terrorism1990-2003-300dpi.png
h-index
Hirsch (2005): A scientist has index h if h of [his/her] Np papers have at least h citations each, and the other (Np − h) papers have at most h citations each.
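Hirsch's definition translates into a short computation: rank the papers by citations and find the last rank at which the citation count is still at least the rank. The sketch below is one straightforward way to do this. The function name `h_index` and the sample citation counts are illustrative and not from the original.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the top `rank` papers all have >= rank citations
        else:
            break      # later papers are cited even less, so stop here
    return h

# Hypothetical citation counts for five papers.
print(h_index([10, 8, 5, 4, 3]))  # 4
```

Four papers have at least 4 citations each, and the remaining paper has no more than 4, so h = 4.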
Bibliometric laws
Lotka’s Law (1926)
The number (of authors) making n contributions is about 1/n² of those making one; and the proportion of all contributors that make a single contribution is about 60 percent (60, 15, 7 … 6 > 10)
Not statistically exact
May be changing with the current model of scholarship
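Lotka's inverse-square relationship can be worked through numerically. The sketch below assumes the idealized form of the law (exactly 1/n², which real data only approximate). The function name and the starting figure of 100 single-paper authors are hypothetical. Under that idealization the share of single-contribution authors is 1 divided by the sum of 1/n² over all n, which is 6/π², or roughly 0.61, in line with the "about 60 percent" figure.

```python
import math

def lotka_expected_authors(single_paper_authors, max_papers):
    """Expected number of authors with n papers, for n = 1..max_papers."""
    return {n: single_paper_authors / n**2 for n in range(1, max_papers + 1)}

# Hypothetical field with 100 authors who each published exactly one paper.
expected = lotka_expected_authors(100, 4)
for n, authors in expected.items():
    print(f"{n} paper(s): about {authors:.1f} authors")

# Idealized share of all authors who publish only once: 1 / (1 + 1/4 + 1/9 + ...).
single_share = 6 / math.pi**2
print(f"single-paper share: {single_share:.3f}")
```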
Bibliometric laws
Bradford’s law (1934)
Journals in a field can be divided into three parts:
1) Core: relatively few journals producing 1/3 of all articles
2) Zone 2: same number of articles, but a greater number of journals
3) Zone 3: same number of articles, but a still greater number of journals
The number of journals in Zone 2 is the number in the core multiplied by a constant n, and the number in Zone 3 is the number in the core multiplied by n².
1:n:n²
Not statistically exact
General power law distribution (akin to Pareto’s law in economics)
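One way to see the 1:n:n² pattern is to rank journals by productivity and cut the ranking into three zones holding equal numbers of articles. The sketch below does this for made-up data. The function name `bradford_zones` and the article counts are hypothetical, constructed so that the ratio comes out cleanly, and real journal data would only approximate it.

```python
def bradford_zones(article_counts, zones=3):
    """Split journals, ranked by output, into zones with equal article totals.

    Returns the number of journals in each zone, core first.
    """
    ranked = sorted(article_counts, reverse=True)
    total = sum(ranked)
    sizes = []
    journals_in_zone = 0
    cumulative = 0
    for count in ranked:
        cumulative += count
        journals_in_zone += 1
        # Close the current zone once its cumulative share of articles is reached.
        if len(sizes) < zones - 1 and cumulative * zones >= total * (len(sizes) + 1):
            sizes.append(journals_in_zone)
            journals_in_zone = 0
    sizes.append(journals_in_zone)  # whatever remains forms the last zone
    return sizes

# Hypothetical field: 2 journals with 45 articles, 6 with 15, 18 with 5.
counts = [45] * 2 + [15] * 6 + [5] * 18
print(bradford_zones(counts))  # [2, 6, 18], i.e. 1 : 3 : 9 with n = 3
```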
Bibliometric laws
Zipf’s Law (1935)
Not statistically exact
General power law probability distribution
If the words occurring in a text are listed in order of decreasing frequency, the rank of a word on that list multiplied by its frequency will equal a constant: r × f = k, where r is the rank of the word, f is its frequency, and k is the constant.
James Joyce's Ulysses
10th most frequent word: 2,653 times
100th most frequent word: 265 times
200th most frequent word: 133 times
In each case, the rank of the word multiplied by its frequency gives a constant of approximately 26,500.
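The Ulysses figures can be checked with a few lines of arithmetic, and the same rank-times-frequency table can be built for any text. This is a rough sketch: `rank_frequency` is a name introduced here, and its tokenization (lowercase runs of letters and apostrophes) is a simplifying assumption rather than a standard.

```python
import re
from collections import Counter

def rank_frequency(text):
    """Return (rank, word, frequency, rank * frequency) for each distinct word."""
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common()  # most frequent word first
    return [(rank, word, freq, rank * freq)
            for rank, (word, freq) in enumerate(ranked, start=1)]

# Rank and frequency figures quoted for Ulysses.
ulysses = {10: 2653, 100: 265, 200: 133}
products = {rank: rank * freq for rank, freq in ulysses.items()}
print(products)  # {10: 26530, 100: 26500, 200: 26600}
```

All three products fall within about half a percent of 26,500, which is the sense in which the constant holds approximately.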
Bibliometric laws
Other power law probability distributions
Pareto’s law (economics): 80-20 rule, law of the vital few, principle of factor sparsity
PageRank (Google)
The Long Tail (markets)
Journal impact factors
As a research method…
Reliability? Validity? Limitations?
Applications?
Finding and use
Collection development
Reference services
Collection evaluation
Use studies
Information retrieval algorithms
Diffusion of ideas
Domain areas and interdisciplinarity
Mapping science
Writing your paper…