What is this class about?
• The Statistical Analysis of Data
This includes 2 key terms that need some explanation:
1. “Statistics”
2. “Data”
Statistics
A. What are “statistics”?
“The use of numbers to quantitatively describe or index the states of some phenomena.”
a) They are Quantitative (numerical)
b) They are Aggregate references (to groups of data)
c) They are Objective information (calculated)
They are intellectual constructions or computations (that may be useful)
a) They exist because we compute them
b) There are alternative ways to compute them
c) But they should refer to real patterns/events
d) They are valid only as long as they “work”
e) Avoid reifications and “weatherman’s fallacy”
Statistics
B. What are they good for?
a) Describe things quantitatively
Descriptive statistics: take the observed data as the whole population of interest
• Summarize a given set of data points
• Index or categorize the data points in the set
b) Make “educated guesses” from limited info
Inferential statistics: take the observed data as a limited sample from a larger population of interest (of which we have limited info)
• Draw conclusions or inferences
• Make decisions
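The descriptive/inferential distinction shows up even in a simple spread calculation. The sketch below is only an illustration: the scores are made-up numbers, and it uses Python's standard `statistics` module. Treating the data as the whole population divides by N; treating it as a sample from a larger population divides by N - 1 and adds a standard error for the estimated mean.

```python
import math
import statistics

# Hypothetical data: exam scores for eight students (invented for illustration).
scores = [72, 85, 90, 66, 78, 88, 95, 70]

# Descriptive view: these eight scores ARE the whole population of interest.
pop_mean = statistics.mean(scores)
pop_sd = statistics.pstdev(scores)       # population SD, divides by N

# Inferential view: the eight scores are a limited sample from a larger population.
sample_sd = statistics.stdev(scores)     # sample SD, divides by N - 1
std_error = sample_sd / math.sqrt(len(scores))  # uncertainty in the estimated mean

print(f"mean = {pop_mean:.1f}")
print(f"descriptive SD (population) = {pop_sd:.2f}")
print(f"inferential SD (sample)     = {sample_sd:.2f}")
print(f"standard error of the mean  = {std_error:.2f}")
```

The mean is the same number either way; what changes is how the spread is computed and what we claim the numbers refer to.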
Statistics – Limitations?
Useful only for answering quantitative questions (i.e., about amounts, degrees, or extents)
Only apply to things that are countable or measurable in objectified terms.
Are statistics inherently misleading? The famous problem of “lying with statistics”
Do we really need statistics? What are the alternatives?
We can’t escape them; they’re everywhere
The issue is to know WHEN and HOW to use them meaningfully
“Types of Analysis”
A. Qualitative vs. Quantitative
Statistical analysis is necessarily quantitative
i.e., we are using numbers to describe the numerical properties or patterns of things
We require data coded into countable & measurable variables.
B. Descriptive vs. Inferential
2 basic analytic tasks in statistics:
(1) Summarize things (a set of data points) in numerical terms
(2) Make inferences and decisions from limited observations
What are “Data” (?)
A. Data = information collected and recorded (a plural noun?)
Data may be Quantitative or Qualitative
Data Set contains many data points
B. The magic word for Quantitative Data = “Variables”
Variable = any attribute or property of some thing that can take on different values/states
• Must have more than one possible state
• Don’t have to be numerical values
“Data” (continued)
C. Different Types of Variables?
1. By their analytical function
a) Dependent variables
b) Independent variables
c) Extraneous variables
Why do functional types of variables matter in statistics?
3. “Data” (continued)
C. Different Types of Variables (cont.)
2. By Level of Measurement
a) Nominal level – numbers as labels
b) Ordinal level – numbers as relative position
c) Interval level – numbers as comparative size
d) Ratio level – numbers as absolute size
The level of measurement represents the uses, inferences, or meanings we make of the data.
Why do levels of variables matter in statistics?
Treating ordinal data as interval data – why not?
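One way to see why treating ordinal data as interval data is risky: ordinal codes carry only rank order, so any order-preserving recoding is equally legitimate. The sketch below uses invented ratings and two hypothetical codings (`equal_steps` and `stretched_top`, names chosen here for illustration). The comparison of means flips between the codings, while the comparison of medians does not.

```python
import statistics

# Hypothetical ordinal ratings (1 = low, 2 = medium, 3 = high) for two groups.
group_a = [1, 1, 1, 3, 3]
group_b = [2, 2, 2, 2, 2]

# Two codings that keep the same rank order. An ordinal scale gives no
# reason to prefer one over the other.
equal_steps = {1: 1, 2: 2, 3: 3}
stretched_top = {1: 1, 2: 2, 3: 10}

def recode(values, coding):
    """Replace each rating with the number the coding assigns to it."""
    return [coding[v] for v in values]

def summarize(coding):
    """Return ((mean A, mean B), (median A, median B)) under one coding."""
    a, b = recode(group_a, coding), recode(group_b, coding)
    means = (statistics.mean(a), statistics.mean(b))
    medians = (statistics.median(a), statistics.median(b))
    return means, medians

for name, coding in [("equal steps", equal_steps), ("stretched top", stretched_top)]:
    means, medians = summarize(coding)
    print(name, "| means (A, B):", means, "| medians (A, B):", medians)
```

With equal steps, group B has the higher mean; with the stretched coding, group A does. The median ranks B above A under both codings, which is why rank-based summaries are the safer choice for ordinal variables.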
3. “Data” (continued)
C. Different Types of Variables (cont.)
3. Other important distinctions?
a) Numerical vs. Nonnumeric
b) Discrete vs. Continuous
4. A very special type of variable = “Binary”
a) Dichotomy: only 2 possible values or outcomes (0 & 1 as only values)
b) Examples? Yes-No; Present-Absent; Drug Use-Abstinence; Guilty-Not Guilty; Alive-Dead; Pass-Fail; Pregnant-Not Pregnant
c) Binary variables = Both Numeric and Nominal (?!)
Why are Binary variables important?
1) The world consists of LOTS of dichotomous events – (a) outcomes; (b) decisions
2) Binary numbers are very well-defined and handy (in both mathematical & practical terms)
a) They are the basis for modern digital computers
b) They represent logical events
3) Many complex events = combinations of binary events
4) Binary = very useful but can present special statistical issues
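A small sketch of why 0/1 coding is handy, using invented pass/fail outcomes: the sum of a binary variable is a count, and its mean is a proportion. It also shows one example of a special statistical issue (chosen here for illustration, not necessarily the one the course has in mind): the variance of a binary variable is fully determined by its mean.

```python
import statistics

# Hypothetical pass/fail outcomes coded 1 = pass, 0 = fail.
outcomes = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

# Numeric use of a nominal variable: the sum counts the 1s,
# and the mean is the proportion of 1s.
count_pass = sum(outcomes)
proportion_pass = statistics.mean(outcomes)

# For 0/1 data the population variance equals p * (1 - p),
# so the spread cannot vary independently of the mean.
variance = proportion_pass * (1 - proportion_pass)

print("number passing:", count_pass)
print("proportion passing:", proportion_pass)
print("variance from p * (1 - p):", round(variance, 4))
print("variance from the data:   ", round(statistics.pvariance(outcomes), 4))
```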
Some Introductory Math Issues:
1) Reading equations and formulas
• Useful to have a basic working knowledge
• Memorization = unnecessary
2) Doing complex tedious arithmetic
• By hand (calculator) – some will be required
• By computer – most of our statistical calculations
3) “Rounding numbers”? (How many places?)
• Interim calculations – carry more decimal places for computation precision
• Final results – round to most meaningful units for ease of interpretation (usually one place more than original numbers)
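The rounding advice above can be checked with a short sketch. The data are invented whole numbers, and the hand-calculation shortcut used here (sum of squares minus N times the squared mean) is an assumption chosen for illustration because it is sensitive to interim rounding. Rounding the mean before squaring it changes the final standard deviation noticeably, even after both results are rounded to one decimal place.

```python
import math

# Hypothetical measurements recorded as whole numbers.
values = [101, 102, 103, 104, 106, 107]
n = len(values)
sum_of_squares = sum(v * v for v in values)

def sample_sd(mean):
    """Sample SD via the shortcut formula: (sum of x^2 - n * mean^2) / (n - 1)."""
    return math.sqrt((sum_of_squares - n * mean ** 2) / (n - 1))

mean_full = sum(values) / n            # interim value kept at full precision
mean_rounded = round(mean_full, 1)     # interim value rounded too early

sd_full = sample_sd(mean_full)
sd_rounded_interim = sample_sd(mean_rounded)

# Final results: one decimal place more than the original whole numbers.
print("SD with full-precision interim mean:", round(sd_full, 1))
print("SD with rounded interim mean:       ", round(sd_rounded_interim, 1))
```

The two printed values differ only because of when the rounding happened, which is the reason to carry extra decimal places until the final step.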