statistics – or 155 section 1 j. s. marron, professor department of statistics and operations...
TRANSCRIPT
![Page 1: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/1.jpg)
Statistics – OR 155
Section 1
J. S. Marron, Professor
Department of Statistics
and Operations Research
![Page 2: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/2.jpg)
Class Information
Handouts:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassInfo/Stor155-09FirstHandout.pdf
With:
• Blackboard Info
• Student Survey
(please fill out & return after class)
![Page 3: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/3.jpg)
Class Information
Go to Blackboard (for class details):
• Website: http://blackboard.unc.edu/
• Log-in with Onyen
• Choose this course
• Control Panel > Content Areas
• Course Information
• Choose Item “Course Information”
![Page 4: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/4.jpg)
Relationship to Textbook
• Ordering of material in textbook is usual
• But I don’t like it
(poorly motivated)
• So will change the order of the material
(for better motivation)
• Will jump around a lot through the text
![Page 5: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/5.jpg)
Reading In Textbook
Approximate Reading for Today’s Material:
Pages 1-5, 197-203, 203-208
Approximate Reading for Next Class:
Pages 237-250
![Page 6: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/6.jpg)
What is Statistics?
Definition 1:
Gaining Insight from Numbers
(similar to text’s definition)
Definition 2:
The Science of Managing Uncertainty
![Page 7: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/7.jpg)
What is Statistics?
Subtopics:
• Gathering the Numbers
– E.g. Statistician at a ball game
– Will see: how this is done is critical
• Forming Conclusions
– Will use math, etc.
– Major focus of this course
![Page 8: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/8.jpg)
Key Themes
I. Uncertainty
II. Variability
(will get quantitative about these)
Favorite Quote:
“I was never good at math, but statistics is easy, since it is just common sense”
![Page 9: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/9.jpg)
Motivating Examples
1. Political Polls
– Try to predict outcome of election
– Too expensive to ask everyone
– So ask some (hope they are “representative”)
2. Measurement Error
– No measurement is exact
– Can improve by multiple measurements
– How to model?
Lessons of these are broadly applicable
![Page 10: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/10.jpg)
Common Structure
For both, find out about truth from a sample
E.g. 1: truth = % for Cand. in population; sample = % for Cand. in sample
E.g. 2: truth = true size; sample = observed measurement
![Page 11: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/11.jpg)
Motivating Examples
1. Political Polls
2. Measurement Error
Will study each using mathematical models
Do E.g. 1 first, since easier
Appropriate Models?
![Page 12: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/12.jpg)
Political Polls
Appropriate Mathematical Models?
Depends on how data are gathered.
See Text, pages 171-177
• Seems easy???
• “Just choose some”???
• Take a look at history…
![Page 16: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/16.jpg)
How to sample?
History of Presidential Election Polls
During campaigns, constantly hear in news “polls say …” How good are these? Why?
1936 Landon vs. Roosevelt
Literary Digest Poll: 43% for R
Result: 62% for R
What happened?
Sample size not big enough? 2.4 million
Biggest Poll ever done (before or since)
![Page 17: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/17.jpg)
Bias in Sampling
Bias: Systematically favoring one outcome
(need to think carefully)
Selection Bias: Addresses from L. D. readers, phone books, club memberships
(representative of population?)
Non-Response Bias: Return-mail survey
(who had time?)
![Page 19: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/19.jpg)
How to sample?
1936 Presidential Election (cont.)
Interesting Alternative Poll:
Gallup: 56% for R (sample size ~ 50,000)
Gallup's prediction of the L. D. result: 44% for R (sample size ~ 3,000)
Predicted both correct result (62% for R),
and L. D. error (43% for R)!
(how was improvement done?)
![Page 20: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/20.jpg)
Improved Sampling
Gallup’s Improvements:
(i) Personal Interviews
(attacks non-response bias)
(ii) Quota Sampling
(attacks selection bias)
![Page 21: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/21.jpg)
Quota Sampling
Idea: make “sample like population”
So surveyor chooses people to give:
i. Right % male
ii. Right % “young”
iii. Right % “blue collar”
iv. …
This worked fairly well (~5% error), until …
![Page 25: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/25.jpg)
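How to sample?
1948 election polls:

| Poll | Dewey | Truman | Sample size |
| --- | --- | --- | --- |
| Crossley | 50% | 45% | |
| Gallup | 50% | 44% | ~50,000 |
| Roper | 53% | 38% | ~15,000 |
| Actual | 45% | 50% | - |

Note: Embarrassing for polls, famous photo of Truman + headline “Dewey Defeats Truman”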
![Page 26: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/26.jpg)
![Page 29: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/29.jpg)
What went wrong?
Problem: Unintentional Bias
(surveyors understood bias, but still made choices)
Lesson: Human choice cannot give a representative sample
Surprising Improvement: Random Sampling
Now called “scientific sampling”
Random = Scientific???
![Page 32: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/32.jpg)
Random Sampling
Key Idea: “random error” is smaller than “unintentional bias”, for large enough sample sizes
How large?
Current sample sizes: ~1,000 - 3,000
Note: now far fewer than the ~50,000 used in 1948.
So surveys are much cheaper
(thus many more done now…)
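A rough simulation can make the key idea concrete. The sketch below is not from the course: the function `poll` and its `reach_supporter` / `reach_other` weights are hypothetical, and the "supporters are half as likely to be reached" bias is an assumption chosen only for illustration. The 62% true share is the 1936 result quoted earlier.

```python
import random

def poll(true_share, n, rng, reach_supporter=1.0, reach_other=1.0):
    """Return the sample share for a candidate among n respondents.

    reach_supporter / reach_other are hypothetical relative chances that a
    supporter / non-supporter ends up in the sample (equal = random sample).
    """
    # Share of supporters among the people the poll actually reaches.
    reached = true_share * reach_supporter
    reached_share = reached / (reached + (1 - true_share) * reach_other)
    # Each respondent is a supporter with probability reached_share.
    hits = sum(rng.random() < reached_share for _ in range(n))
    return hits / n

rng = random.Random(155)   # fixed seed so the run is repeatable
TRUE_SHARE = 0.62          # 1936 result for R, from the slides

# Assumed bias: supporters are half as likely to be reached.
biased_big = poll(TRUE_SHARE, 200_000, rng, reach_supporter=0.5)
# No bias: everyone is equally likely to be reached.
random_small = poll(TRUE_SHARE, 1_000, rng)

print(f"biased, n = 200,000: {biased_big:.3f}")
print(f"random, n = 1,000:   {random_small:.3f}")
```

Under these assumptions the huge biased poll settles on the wrong answer, and adding more respondents does not help, while the small random poll lands near the truth.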
![Page 33: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/33.jpg)
Random Sampling
How Accurate?
• Can (& will) calculate using “probability”
• Justifies term “scientific sampling”
• 2nd improvement over quota sampling
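As a preview of the kind of calculation meant here, the sketch below uses the standard large-sample formula for a sample proportion under simple random sampling, z * sqrt(p(1 - p) / n). The choices p = 0.5 (worst case) and z = 1.96 (about 95% confidence) are assumptions made for illustration, not values from the slides.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a sample proportion
    under simple random sampling (standard large-sample formula)."""
    return z * math.sqrt(p * (1 - p) / n)

# Sample sizes mentioned in the slides: current polls and the 1948 polls.
for n in (1_000, 3_000, 50_000):
    print(f"n = {n:>6,}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

This works out to roughly 3 percentage points at n = 1,000 and under half a point at n = 50,000, so the much smaller modern samples give up little accuracy.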
![Page 34: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/34.jpg)
Random Sampling
What is random?
Simple Random Sampling:
Each member of population is equally likely to be in sample
Key Idea: Different from “just choose some”
![Page 37: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/37.jpg)
Random Sampling
An old (but still fun?) experiment:
Choose a number among 1,2,3,4
Old typical results: about 70% choose “3”
(perhaps you have seen this before…)
Main lesson: human choice does not give “equally likely” outcomes (i.e. not a random sample)
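For contrast, a computer can run the same experiment. This is a minimal sketch using Python's standard `random` module; the seed and the 10,000 draws are arbitrary choices. If the draws are equally likely, each of 1, 2, 3, 4 should come up about 25% of the time, far from the 70% that people give to “3”.

```python
import random
from collections import Counter

rng = random.Random(4)  # fixed seed so the run is repeatable

# Ask the computer to "choose a number among 1,2,3,4" many times.
draws = [rng.randint(1, 4) for _ in range(10_000)]
counts = Counter(draws)

# Fraction of draws landing on each number.
shares = {k: counts[k] / len(draws) for k in (1, 2, 3, 4)}
for k, share in shares.items():
    print(f"{k}: {100 * share:.1f}%")
```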
![Page 38: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/38.jpg)
Random Sampling
How to choose a random sample?
Old Approaches:
– Random Number Table
– Roll Dice
Modern Approach:
– Computer Generated
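One way the computer-generated approach might look in practice is sketched below with Python's standard `random.sample`, which draws without replacement so that every member is equally likely to be picked. The roster of 25 numbered students and the helper name `simple_random_sample` are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical roster: 25 students labelled 1..25.
roster = list(range(1, 26))
chosen = simple_random_sample(roster, 5, seed=2009)
print(chosen)
```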
![Page 39: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/39.jpg)
Random Sampling HW
Interesting Question:
What is the % of Male Students at UNC?
(Your chance of a date,
or take 100% minus this to get your chance)
HW:
C1: Class Handout
http://stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/HWAsst/Stor155HWC1.pdf
![Page 40: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/40.jpg)
Random Sampling HW
Notes on HW C1:
• 3 dumb ways to sample, 1 good one
• Goal is to learn about sampling, not “get right answer”
• Part 1, put symbol for yourself, Ms and Fs for others
• Put both count & % (% = 100 x count / 25)
• Part 2, “tally” is:
• Part 4, student phone directory available in Student Union?
![Page 41: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/41.jpg)
Random Sampling HW
Notes on HW C1,
• Hints on Part 4:
– For each draw, first draw a “random page”
– Tools > Data Analysis > Random Number Generation > Uniform is one way to do this
– In “Uniform”, you need to set “Parameters” to 0 and “number of pages”
– This gives a random decimal; to get an integer, round up, using CEILING
– In CEILING, set “significance” to 1
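The same recipe can be sketched outside the spreadsheet. The Python version below is an illustration only: `random_page` and the 300-page directory are hypothetical, and `math.ceil` stands in for CEILING with significance 1. The guard against a draw of exactly 0 is an added precaution, since rounding 0 up would give page 0.

```python
import math
import random

def random_page(number_of_pages, rng=random):
    """Mimic the spreadsheet recipe: Uniform(0, pages), then round up."""
    while True:
        u = rng.uniform(0, number_of_pages)  # random decimal between 0 and number_of_pages
        page = math.ceil(u)                  # round up to a whole page number
        if 1 <= page <= number_of_pages:     # redraw in the rare case u is exactly 0
            return page

NUMBER_OF_PAGES = 300  # hypothetical directory length
rng = random.Random(1)
pages = [random_page(NUMBER_OF_PAGES, rng) for _ in range(5)]
print(pages)
```

In Python, `rng.randint(1, NUMBER_OF_PAGES)` would give the same equally likely page numbers in one step.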
![Page 42: Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research](https://reader035.vdocument.in/reader035/viewer/2022062322/5697bf7d1a28abf838c849b2/html5/thumbnails/42.jpg)
Random Sampling HW
Notes on HW C1,
• Hints on Part 4 (cont.):
– Next Choose Random Column
– Next Choose Random Name
– Caution: Different numbers on each page.
– Challenge: still make equally likely
– Approach: choose larger number
– Approach: when not there, just toss it out
– Approach: then do a “redraw”
– Also redraw if can’t tell gender
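The "choose larger number, toss out, redraw" approach can be sketched as follows. The tiny directory and the function `draw_name` are hypothetical. Positions are drawn using the largest column count and the longest column, and a draw that lands on an empty slot is discarded. Since every (page, column, position) slot is equally likely and each name fills exactly one slot, every name is equally likely, even though pages hold different numbers of names.

```python
import random
from collections import Counter

def draw_name(directory, rng):
    """directory: list of pages; each page is a list of columns;
    each column is a list of names."""
    max_cols = max(len(page) for page in directory)
    max_names = max(len(col) for page in directory for col in page)
    while True:
        page = rng.randrange(len(directory))  # random page
        col = rng.randrange(max_cols)         # random column ("larger number")
        row = rng.randrange(max_names)        # random position in the column
        if col < len(directory[page]) and row < len(directory[page][col]):
            return directory[page][col][row]
        # Nothing at that slot: toss it out and redraw.

# Hypothetical directory with uneven pages: 4, 1 and 4 names.
directory = [
    [["A", "B", "C"], ["D"]],
    [["E"]],
    [["F", "G"], ["H", "I"]],
]

rng = random.Random(2009)
counts = Counter(draw_name(directory, rng) for _ in range(9_000))
for name in sorted(counts):
    print(name, counts[name])
```

Without the redraw step (for example, picking a page first and then a name on that page), the lone name on the second page would be chosen far too often.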