Statistics for Librarians, Session 2: Descriptive Statistics
DESCRIPTION
The second in a series of four seminars presented to University of North Texas librarians. This presentation focuses on organizing and presenting basic descriptive statistics, including measures of central tendency and variation.
TRANSCRIPT
![Page 1: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/1.jpg)
EXPLORATORY DATA ANALYSIS
DESCRIPTIVE STATISTICS
![Page 2: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/2.jpg)
REVIEW
![Page 3: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/3.jpg)
Results
Bias?
Sampling Error?
Invalid Measures?
Random Error?
Other Factors?
PURPOSE OF STATISTICS
![Page 4: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/4.jpg)
VARIABLES
Independent
Subjects
Factors
Effects of…
Dependent
Objects
Outcomes
Effects on…
![Page 5: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/5.jpg)
SCALES OF DATA (NOIR)
Nominal
• Counts by category
• Binary (Yes/No)
• No meaning between the categories (Blue is not better than Red)
Ordinal
• Ranks
• Scales
• Space between ranks is subjective
Interval
• Integers
• Zero is just another value – doesn’t mean “absence of”
• Space between values is equal and objective, but discrete
Ratio
• Interval data with a baseline
• Zero (0) means “absence of”
• Space between is continuous
• Includes simple counts
![Page 6: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/6.jpg)
ANOTHER WAY
Qualitative
• Counts by Categories
• Ranks
• Scales
Quantitative
• Measurements
• Composite scores
• Simple Counts
![Page 7: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/7.jpg)
EXAMPLE DATA SET
PACS FACULTY CITATION ANALYSIS
![Page 8: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/8.jpg)
RESEARCH QUESTION
Does UNT Libraries provide access to the resources used by PACS faculty, based on references in their published works?
![Page 9: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/9.jpg)
PACS STUDY VARIABLES
Faculty
• Department
• Years at UNT
Published
• # published by type
• Rankings of journals
Cited
• # cited by type
• Rankings of journals
• UNT accessible
IV
DV
![Page 10: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/10.jpg)
PACS STUDY VARIABLES BY SCALE
Qualitative
• # of publications by type
• # of citations by type
• # references available
Quantitative
• Years at UNT
• Years since PhD
![Page 11: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/11.jpg)
EXPLORATORY DATA ANALYSIS
GETTING TO KNOW YOUR DATA, INTIMATELY
![Page 12: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/12.jpg)
DISTRIBUTIONS
![Page 13: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/13.jpg)
QUALITATIVE DATA
Tables
• Counts
• Percentages/Ratios
• By row and column
Excel
• Pivot Tables
![Page 14: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/14.jpg)
TABLES
| Department | Num Faculty | % of Faculty |
| --- | --- | --- |
| Anthropology | 20 | 18% |
| Behavior Analysis | 17 | 15% |
| Criminal Justice | 18 | 16% |
| Public Administration | 19 | 17% |
| Rehab, Social Work, & Addictions | 18 | 16% |
| Sociology | 21 | 19% |
| Totals | 113 | 100% |

| Department | Article | % Articles | Other |
| --- | --- | --- | --- |
| Anthropology | 73 | 61% | 46 |
| Behavior Analysis | 65 | 81% | 15 |
| Criminal Justice | 54 | 69% | 24 |
| Public Administration | 64 | 58% | 47 |
| Rehabilitation, Social Work, and Addictions | 49 | 82% | 11 |
| Sociology | 83 | 62% | 50 |
| Totals | 388 | 67% | 193 |

| Availability | # Refs | % |
| --- | --- | --- |
| Available | 586 | 79.62% |
| Title not avail | 134 | 17.66% |
| Year not avail | 23 | 2.72% |
| Grand Total | 743 | 100.00% |
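The count and percentage columns of the faculty table can also be produced outside Excel. The following is a minimal Python sketch, assuming the counts are typed in by hand; the function name `percent_table` is made up for this illustration and is not part of the presentation.

```python
# Faculty counts per department, copied from the faculty table on this slide.
faculty_counts = {
    "Anthropology": 20,
    "Behavior Analysis": 17,
    "Criminal Justice": 18,
    "Public Administration": 19,
    "Rehab, Social Work, & Addictions": 18,
    "Sociology": 21,
}


def percent_table(counts):
    """Return {category: (count, percent of the grand total)}."""
    total = sum(counts.values())
    return {name: (n, 100 * n / total) for name, n in counts.items()}


table = percent_table(faculty_counts)
for name, (n, pct) in table.items():
    print(f"{name:<35}{n:>5}{pct:>6.0f}%")
print(f"{'Totals':<35}{sum(faculty_counts.values()):>5}{100:>6.0f}%")
```

Rounding each percentage to a whole number, as the slide does, means the displayed column may not always add up to exactly 100%.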
![Page 15: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/15.jpg)
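| Department | Article | Article % | Book | Book % | Other | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Anthropology | 1152 | | 666 | | | 2012 |
| Behavior Analysis | 1412 | | 289 | | | 1740 |
| Criminal Justice | 1220 | | 624 | | | 2003 |
| Public Administration | 966 | | 561 | | | 1724 |
| Rehabilitation, Social Work, and Addictions | 852 | | 365 | | | 1282 |
| Sociology | 2238 | | 1558 | | | 3970 |
| Totals | | | | | | |

| Department | Article | Article % | Book | Book % | Other | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Anthropology | 1152 | 57% | 666 | 33% | 194 | 2012 |
| Behavior Analysis | 1412 | 81% | 289 | 17% | 39 | 1740 |
| Criminal Justice | 1220 | 61% | 624 | 31% | 159 | 2003 |
| Public Administration | 966 | 56% | 561 | 33% | 197 | 1724 |
| Rehabilitation, Social Work, and Addictions | 852 | 66% | 365 | 28% | 65 | 1282 |
| Sociology | 2238 | 56% | 1558 | 39% | 174 | 3970 |
| Totals | 7840 | (avg) 63% | 4063 | 30% | 828 | 12731 |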
ACTIVITY 1
![Page 16: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/16.jpg)
GRAPHS
[Chart: % Articles by Department; axis ticks 0%, 40%, 80%; departments: Anthropology; Behavior Analysis; Criminal Justice; Public Administration; Rehabilitation, Social Work, and Addictions; Sociology]
[Chart: % of Faculty]
![Page 17: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/17.jpg)
GRAPH & CHART RULES OF THUMB
• Trends
• Connection across the X-axis

• Categorical
• Comparisons
• Grouped
• Stacked
• Relative Stacked

• Categorical
• Few Categories
• Differences are Wide
![Page 18: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/18.jpg)
ACTIVITY 2
Draw a bar graph of References by Type
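| Department | Article | Article % | Book | Book % | Other | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Anthropology | 1152 | 57% | 666 | 33% | 194 | 2012 |
| Behavior Analysis | 1412 | 81% | 289 | 17% | 39 | 1740 |
| Criminal Justice | 1220 | 61% | 624 | 31% | 159 | 2003 |
| Public Administration | 966 | 56% | 561 | 33% | 197 | 1724 |
| Rehabilitation, Social Work, and Addictions | 852 | 66% | 365 | 28% | 65 | 1282 |
| Sociology | 2238 | 56% | 1558 | 39% | 174 | 3970 |
| Totals | 7840 | (avg) 63% | 4063 | 30% | 828 | 12731 |

[Bar graph of references by type; axis ticks 0, 2000, 4000; legend: Other, Book, Article]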
![Page 19: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/19.jpg)
QUANTITATIVE DISTRIBUTIONS
Stem & Leaf
Histogram
Distribution graphs
![Page 20: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/20.jpg)
EXPLORATORY DATA ANALYSIS
• John W. Tukey
• Exploratory Data Analysis
• Examining your data visually.
• Stem & Leaf
• Hinges
• Box plots
• Scatter plots, etc.
![Page 21: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/21.jpg)
STEM-AND-LEAF
| Stem | Leaf |
| --- | --- |
| 0 | 1122223334445555666666677777899 |
| 1 | 000011122222222333346677889 |
| 2 | 0122234468 |
| 3 | 1112355888 |
| 4 | 12 |

Stem = first digit(s); Leaf = last digit
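The stem (first digit or digits) and leaf (last digit) split can be sketched in a few lines of Python. This is only an illustration under simple assumptions: values are non-negative whole numbers, and the function name `stem_and_leaf` is invented here.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative integers by stem (all but the last digit) and leaf (last digit)."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    # Include empty stems so gaps in the data stay visible.
    return {stem: rows.get(stem, []) for stem in range(min(rows), max(rows) + 1)}


# Seven of the Years since PhD values that appear later in this deck.
years = [1, 1, 2, 14, 16, 41, 42]
for stem, leaves in stem_and_leaf(years).items():
    print(stem, "|", "".join(str(leaf) for leaf in leaves))
```

Keeping the empty stems (2 and 3 in this small sample) preserves the shape of the distribution, which is the point of the display.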
![Page 22: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/22.jpg)
ACTIVITY 3
Create a stem-and-leaf table for Years at UNT.
Stem Leaf
0 01112222222222222233333344445556666677788899
1 0000000011122223333356778899
2 00122234444799
3 0245
![Page 23: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/23.jpg)
FROM STEM-AND-LEAF TO HISTOGRAMS
![Page 24: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/24.jpg)
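| Stem | Leaf | Count |
| --- | --- | --- |
| 0 | 1122223334445555666666677777899 | 31 |
| 1 | 000011122222222333346677889 | 27 |
| 2 | 0122234468 | 10 |
| 3 | 1112355888 | 11 |
| 4 | 12 | 2 |

| Range | Count |
| --- | --- |
| 0-9 | 31 |
| 10-19 | 27 |
| 20-29 | 10 |
| 30-39 | 11 |
| 40-49 | 2 |

[Histogram of Years at UNT; ranges 0-9, 10-19, 20-29, 30-39, 40-49; count axis 0 to 40]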
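Turning stems into ranges is a counting step, so it can be sketched in Python as well. The example below is a rough illustration: `decade_counts` and `text_histogram` are names invented here, and the bars are plain text rather than an Excel chart.

```python
from collections import Counter


def decade_counts(values):
    """Count how many values fall in 0-9, 10-19, 20-29, and so on."""
    counts = Counter(value // 10 for value in values)
    return {f"{10 * d}-{10 * d + 9}": counts.get(d, 0) for d in range(max(counts) + 1)}


def text_histogram(range_counts):
    """Draw one '#' per observation next to each range label."""
    return "\n".join(f"{label:>5} {'#' * n} {n}" for label, n in range_counts.items())


# Range counts as listed on this slide.
slide_counts = {"0-9": 31, "10-19": 27, "20-29": 10, "30-39": 11, "40-49": 2}
print(text_histogram(slide_counts))

# Binning raw values gives the same kind of table.
print(decade_counts([1, 1, 2, 14, 16, 41, 42]))
```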
![Page 25: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/25.jpg)
ACTIVITY 4
Create a histogram of the Years at UNT
Stem Leaf
0 01112222222222222233333344445556666677788899
1 0000000011122223333356778899
2 00122234444799
3 0245
| Stem | Leaf | Count |
| --- | --- | --- |
| 0 | 01112222222222222233333344445556666677788899 | 44 |
| 1 | 0000000011122223333356778899 | 28 |
| 2 | 00122234444799 | 14 |
| 3 | 0245 | 4 |

[Histogram: Years at UNT; ranges 0-9, 10-19, 20-29, 30-39; count axis 0 to 50]
![Page 26: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/26.jpg)
PIVOT TABLES
Select Data
• Highlight table
• Insert->Pivot Table
Select Variables
• Categories (Row Labels)
• Values
Change Settings
• Percentage of Grand Total
• Average
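For readers who want to see what a pivot table computes, here is a small Python sketch of the same idea: group rows by a category, then report a count, a percentage of the grand total, and an average. The rows are hypothetical values invented for illustration, not data from the PACS study, and the function name `pivot` is an assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical rows (department, years at UNT), invented purely for illustration.
rows = [
    ("Anthropology", 4),
    ("Anthropology", 12),
    ("Sociology", 2),
    ("Sociology", 9),
    ("Sociology", 25),
]


def pivot(rows):
    """Summarize rows by category: count, percent of grand total, and average value."""
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)
    total = len(rows)
    return {
        category: {
            "count": len(values),
            "pct_of_total": 100 * len(values) / total,
            "average": sum(values) / len(values),
        }
        for category, values in groups.items()
    }


for category, summary in pivot(rows).items():
    print(category, summary)
```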
![Page 27: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/27.jpg)
DEMONSTRATION OF PIVOT TABLES IN EXCEL
![Page 28: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/28.jpg)
HISTOGRAMS IN EXCEL
Analysis ToolPak
• Options
• Add-ins
• Manage Add-ins
Set ranges
• Equal spacing
• Enter the highest # for each range
• Ceiling (“more”)
Create Histogram
• Data
• Data Analysis
• Histogram
Create Graph
• Insert Bar Chart
• Highlight histogram
• Select bars & Format Selection
• Gap Width=0%
![Page 29: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/29.jpg)
DEMONSTRATION OF HISTOGRAM IN EXCEL
![Page 30: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/30.jpg)
MEASURES OF CENTRAL TENDENCY
Mean
• Average
Median
• Middle
Mode
• Most Common
![Page 31: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/31.jpg)
CENTRAL TENDENCY BY SCALES
Quantitative
Mean
Median
Qualitative
Median (not Nominal)
Mode
![Page 32: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/32.jpg)
ACTIVITY 5
# Available
• Mode
# References by Type
• Mode
Years Since PhD
• Mean
• Median
Years at UNT
• Mean
• Median
![Page 33: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/33.jpg)
MEAN
Sum of all the values divided by the count of values
$$\bar{X} = \frac{\sum X}{n}$$
$\bar{X}$ = sample mean
∑ = “sum of…”
X = values of the variable
n = number of values
![Page 34: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/34.jpg)
EXCEL FUNCTIONS FOR MEASURES OF CENTRAL TENDENCY
Mean
• =Average(range)
Median
• =Median(range)
Mode
• =Mode(range)
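The three Excel functions have close counterparts in Python's standard `statistics` module. The sketch below uses only the seven Years since PhD values shown on the worked standard deviation slide, so the results illustrate the calls and are not the statistics for the full data set of 81 values.

```python
import statistics

# Seven Years since PhD values shown in the worked standard deviation slide;
# the full data set has 81 values, so these results are for illustration only.
years = [1, 1, 2, 14, 16, 41, 42]

mean = statistics.mean(years)      # like =AVERAGE(range)
median = statistics.median(years)  # like =MEDIAN(range)
mode = statistics.mode(years)      # like =MODE(range)

print(f"mean={mean:.2f} median={median} mode={mode}")
```

In this small sample the mean sits above the median because the two large values (41 and 42) pull it upward, which is one reason to report both.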
![Page 35: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/35.jpg)
SPREAD (REVIEW)
Quantitative
• Range
• Quartiles or Quintiles
• Standard Deviation
Qualitative
• Distribution Tables
• Bar Graphs
How variable is the data?
![Page 36: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/36.jpg)
RANGE & QUARTILES
![Page 37: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/37.jpg)
PRESENTATION OF SPREAD
• Box plots
• Median
• Upper & lower quartiles
• Outliers
• Cross-tabulations
• Bar graphs
![Page 38: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/38.jpg)
BOXPLOT IN EXCEL
Set parameters
• Median
• Quartile 1
• Minimum
• Maximum
• Quartile 3
Use Excel functions
• Median(range)
• Quartile.inc(range,1)
• Min(range)
• Max(range)
• Quartile.inc(range,3)
Insert Chart
• Highlight both columns
• Select a bar chart
• Switch the columns & rows
• Modify the formats of each element
• YouTube tutorial
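The five box plot parameters can be checked outside Excel with a short Python sketch. It assumes Python 3.8 or later and assumes that `statistics.quantiles` with `method="inclusive"` interpolates the same way as Excel's inclusive quartile function; that correspondence is worth verifying on your own data. The function name `five_number_summary` is invented for this example.

```python
import statistics


def five_number_summary(values):
    """Return the five values a box plot is drawn from."""
    # method="inclusive" treats the data as containing its own minimum and
    # maximum, which is assumed here to match Excel's inclusive quartiles.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


# Seven Years since PhD values from the deck, for illustration only.
print(five_number_summary([1, 1, 2, 14, 16, 41, 42]))
```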
![Page 39: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/39.jpg)
STANDARD DEVIATION
• Measure of dispersion of data
• Square root of the average squared difference from the mean
![Page 40: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/40.jpg)
STANDARD DEVIATION WORKED OUT
| Years since PhD ($X$) | Mean ($\bar{X}$) | Difference from Mean | Difference from Mean Squared |
| --- | --- | --- | --- |
| 1 | 14.86 | -13.86 | 192.216 |
| 1 | 14.86 | -13.86 | 192.216 |
| 2 | 14.86 | -12.86 | 165.4876 |
| 14 | 14.86 | -0.86 | 0.746837 |
| 16 | 14.86 | 1.14 | 1.290047 |
| 41 | 14.86 | 26.14 | 683.0802 |
| 42 | 14.86 | 27.14 | 736.3518 |
| n=81 | 14.86 | 0.00 | 9931.506 |
![Page 41: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/41.jpg)
WORK IT OUT
$$s = \sqrt{\frac{9931.506}{81 - 1}}$$
$$s = \sqrt{\frac{9931.506}{80}}$$
$$s = \sqrt{124.1438}$$
$$s = 11.14198$$
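The arithmetic can be replayed in Python as a check. The first part uses only the two figures from the worked table (the sum of squared differences, 9931.506, and n = 81). The second part repeats the same steps on seven of the raw values, purely as an illustration, and compares the result with `statistics.stdev`, which also divides by n − 1. The function name is invented for this sketch.

```python
import math
import statistics


def sample_sd_from_sum_of_squares(sum_sq_diff, n):
    """Square root of the summed squared differences divided by n - 1."""
    return math.sqrt(sum_sq_diff / (n - 1))


# Figures from the worked table: sum of squared differences and sample size.
s = sample_sd_from_sum_of_squares(9931.506, 81)
print(round(s, 5))

# The same steps on raw values; statistics.stdev also divides by n - 1.
values = [1, 1, 2, 14, 16, 41, 42]  # seven of the 81 values, for illustration
mean = sum(values) / len(values)
by_hand = math.sqrt(sum((x - mean) ** 2 for x in values) / (len(values) - 1))
assert math.isclose(by_hand, statistics.stdev(values))
```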
![Page 42: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/42.jpg)
SPREAD IN EXCEL
Range
• =Min(range)
• =Max(range)
Quantiles
• =Percentile.inc(range, %)
• =Quartile.inc(range, {1,2,3,4})
Standard Deviation
• =STDEV.S(range)
![Page 43: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/43.jpg)
WHAT DOES THE STANDARD DEVIATION TELL YOU?
Greater variation, less certainty
Lower variation, more certainty
![Page 44: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/44.jpg)
FROM HISTOGRAMS TO FREQUENCY DISTRIBUTIONS
![Page 45: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/45.jpg)
NORMAL DISTRIBUTIONS
![Page 48: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/48.jpg)
BIVARIATE ANALYSIS
![Page 49: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/49.jpg)
SCATTER PLOT
Relationship of two variables
Quantitative Only
![Page 50: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/50.jpg)
CORRELATIONS
Direct
• As x increases, y increases
Indirect
• As x increases, y decreases
No Correlation
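One common way to put a number on these three patterns is the Pearson correlation coefficient, which the slide does not name; it is offered here only as an illustrative sketch. Values near +1 suggest a direct relationship, values near −1 an indirect one, and values near 0 little or no linear relationship. The data points are made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient: +1 direct, -1 indirect, near 0 none."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values, only to show the three patterns.
x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))   # direct: y rises with x
print(pearson_r(x, [10, 8, 6, 4, 2]))   # indirect: y falls as x rises
print(pearson_r(x, [3, 1, 4, 1, 3]))    # little or no linear relationship
```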
![Page 51: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/51.jpg)
DEMONSTRATION OF SCATTER PLOT IN EXCEL
Select Data
• Highlight both columns
Insert graph
• Scatter
• Layout 9
Change Labels
• X-axis label
• Y-axis label
![Page 52: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/52.jpg)
CROSS-TABULATIONS
Qualitative Two Variables
Fewer Categories
Row Percentage
Column Percentage
Pivot Tables in Excel
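Row and column percentages differ only in which total each cell is divided by. The Python sketch below shows both, using two rows of the references-by-type table from earlier in the deck; the function names are invented for this example.

```python
# Reference counts by department and type, taken from two rows of the
# references table earlier in this deck.
crosstab = {
    "Anthropology": {"Article": 1152, "Book": 666, "Other": 194},
    "Behavior Analysis": {"Article": 1412, "Book": 289, "Other": 39},
}


def row_percentages(table):
    """Each cell as a percentage of its row total."""
    return {
        row: {col: 100 * n / sum(cells.values()) for col, n in cells.items()}
        for row, cells in table.items()
    }


def column_percentages(table):
    """Each cell as a percentage of its column total."""
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {
        row: {col: 100 * n / col_totals[col] for col, n in cells.items()}
        for row, cells in table.items()
    }


for row, cells in row_percentages(crosstab).items():
    print(row, {col: f"{pct:.0f}%" for col, pct in cells.items()})
```

Row percentages answer "what share of this department's references are articles?", while column percentages answer "what share of all articles come from this department?", so the choice depends on the question being asked.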
![Page 53: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/53.jpg)
CONTINGENCY TABLE
| Test A/B | Yes | No | Total |
| --- | --- | --- | --- |
| Yes | 10 | 15 | 25 |
| No | 50 | 25 | 75 |
| Totals | 60 | 40 | 100 |

Simple Cross-tab
Two Binomial Variables
Powerful Statistics
• Odds Ratios & Risk Ratios
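As a hedged illustration of the two statistics named on this slide, the sketch below computes a risk ratio and an odds ratio from the four cells of the contingency table. The slide does not say which variable is the grouping and which is the outcome, so the code assumes that rows are groups and that the first column ("Yes") is the outcome of interest; swapping that assumption changes the results.

```python
# Cell counts from the contingency table on this slide.
# Assumption: rows are groups and the first column ("Yes") is the outcome of interest.
a, b = 10, 15   # row "Yes": outcome Yes, outcome No
c, d = 50, 25   # row "No":  outcome Yes, outcome No


def risk_ratio(a, b, c, d):
    """Proportion with the outcome in the first row divided by that in the second."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of the outcome in the first row divided by the odds in the second."""
    return (a / b) / (c / d)


print(f"risk ratio = {risk_ratio(a, b, c, d):.2f}")
print(f"odds ratio = {odds_ratio(a, b, c, d):.2f}")
```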
![Page 54: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/54.jpg)
IMPORTANCE OF DESCRIPTIVE STATISTICS
Describes
• Population
• Sample
• Results
Compares
• Sample to Population
• Sub-groups
• Correlations
Summarizes
• Central Tendency
• Spread
![Page 55: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/55.jpg)
PROGRESSION FROM DESCRIPTIVE TO INFERENTIAL STATISTICS
Central Tendency
Spread
Distributions
Probability
Inferential Statistics
![Page 56: Statistics for Librarians, Session 2: Descriptive statistics](https://reader036.vdocument.in/reader036/viewer/2022070313/554992b4b4c905b96a8b55e3/html5/thumbnails/56.jpg)
RESOURCES
Rice Virtual Lab in Statistics
Excel Tutorials for Statistical Analysis
Khan Academy (videos)
Basic Research Methods for Librarians (ebook)
Descriptive Statistical Techniques for Librarians (ebook)