UDM MSc Course in Education & Development 2013
NicholasSpaull@gmail.com – www.nicspaull.com/teaching
Day 2: Core statistics 101
Introduction
What are statistics?
“The practice or science of collecting and analysing numerical data in large quantities”
Why do we need descriptive statistics?
When we look at large amounts of data, there is very little “face value” information. If you had a dataset listing the income of 10,000 people and someone asked you whether the income of the group was high or low, it would be difficult to answer that question without using summary statistics (mean, median, mode, etc.).
Types of Data
Data
  Categorical (defined categories). Examples: Marital Status, Political Party, Eye Color
  Numerical
    Discrete (counted items). Examples: Number of Children, Defects per hour
    Continuous (measured characteristics). Examples: Weight, Voltage
Collecting Data
Primary Sources (Data Collection): Observation, Experimentation, Survey
Secondary Sources (Data Compilation): Print or Electronic
Sampling
What is a sample?
A sample is “a small part or quantity intended to show what the whole is like”
Why do we use samples rather than the population?
Descriptive Statistics
Collect data, e.g., Survey
Present data, e.g., Tables and graphs
Characterize data, e.g., Sample mean = $\frac{\sum X_i}{n}$
Measures of Central Tendency
Central Tendency
Mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
Median: midpoint of ranked values
Mode: most frequently observed value
Mean
The most common measure of central tendency
Mean = sum of values divided by the number of values
Affected by extreme values (outliers)
Values 1, 2, 3, 4, 5: $\frac{1+2+3+4+5}{5} = \frac{15}{5} = 3$, so Mean = 3
Values 1, 2, 3, 4, 10: $\frac{1+2+3+4+10}{5} = \frac{20}{5} = 4$, so Mean = 4
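The effect of an outlier on the mean can be checked with a few lines of Python. This is only a minimal sketch: the helper name `sample_mean` is made up for illustration, and the two lists are the small datasets from the slide (1, 2, 3, 4, 5 and 1, 2, 3, 4, 10).

```python
def sample_mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


# The two small datasets from the slide; only the last value differs.
without_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]

mean_without = sample_mean(without_outlier)
mean_with = sample_mean(with_outlier)

print("Mean without outlier:", mean_without)
print("Mean with outlier:   ", mean_with)
```

Changing a single value from 5 to 10 moves the mean from 3 to 4, which is what "affected by extreme values" means in practice.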
Median
In an ordered array, the median is the “middle” number (50% above, 50% below)
Not affected by extreme values
[Figure: two number lines from 0 to 10, one without and one with an extreme value; Median = 3 in both]
Finding the Median
The location of the median:
$\text{Median position} = \frac{n+1}{2}$ position in the ordered data
If the number of values is odd, the median is the middle number
If the number of values is even, the median is the average of the two middle numbers
Note that $\frac{n+1}{2}$ is not the value of the median, only the position of the median in the ranked data
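The position rule above can be sketched in Python as follows. The function name `median_by_position` is an illustrative assumption; Python's built-in `statistics.median` gives the same answers and would normally be used instead.

```python
def median_by_position(values):
    """Find the median using the (n + 1) / 2 position rule."""
    ordered = sorted(values)          # the rule applies to ranked data
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")

    if n % 2 == 1:
        # Odd n: (n + 1) / 2 is a whole number, the 1-based middle position.
        position = (n + 1) // 2
        return ordered[position - 1]  # convert to 0-based index

    # Even n: (n + 1) / 2 falls halfway between two positions,
    # so average the two middle values.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(median_by_position([1, 2, 3, 4, 10]))  # odd number of values
print(median_by_position([4, 1, 3, 2]))      # even number of values
```

Note that the function sorts the data first: the position only makes sense in the ordered list, not in the order the data were collected.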
Mode
A measure of central tendency
Value that occurs most often
Not affected by extreme values
Used for either numerical or categorical (nominal) data
There may be no mode
There may be several modes
[Figure: two dot plots; on the first (axis 0 to 14) Mode = 9; on the second (axis 0 to 6) every value occurs once, so there is no mode]
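The "no mode" and "several modes" cases can be made concrete with a short Python sketch. The function name `modes` and the convention used here (no mode when every value occurs exactly once) are assumptions for illustration; other textbooks and software handle these cases differently.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent value(s).

    Convention assumed here: if every value occurs only once,
    the dataset has no mode and an empty list is returned.
    """
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([0, 1, 2, 3, 4, 5, 6]))             # no mode
print(modes([1, 2, 2, 3, 3]))                   # several modes
print(modes(["married", "single", "married"]))  # categorical data
```

The last call shows that the mode also works for categorical data, where a mean or median would not make sense.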
Review Example
Five houses on a hill by the beach
House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000
Review Example: Summary Statistics
Mean: ($3,000,000/5) = $600,000
Median: middle value of ranked data = $300,000
Mode: most frequent value = $100,000
House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000
Sum: $3,000,000
Mean, median, mode and range
Mean = the average value
Median = the middle value in an ordered list of data
Mode = the most common value
Range = difference between highest and lowest value
Example: If we measured the heights of a class and we found:
In cm: 160, 160, 162, 163, 164, 164, 165, 165, 165, 180, 190
Mean = (160+160+162+163+164+164+165+165+165+180+190)/11 = 1838/11 ≈ 167
Median = 164 (the 6th of the 11 ordered values)
Mode = 165 (it occurs three times)
Range = 190 – 160 = 30
If you are still confused about how to calculate the mean, median and mode, watch this 4-minute video on YouTube: http://www.youtube.com/watch?v=k3aKKasOmIw
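As a cross-check of the height example, here is a minimal Python sketch using the standard `statistics` module. It assumes the eleven heights that appear in the calculations above; the variable names are chosen only for illustration.

```python
from statistics import mean, median, mode

# The eleven heights (in cm) used in the calculations above.
heights_cm = [160, 160, 162, 163, 164, 164, 165, 165, 165, 180, 190]

height_mean = mean(heights_cm)                     # 1838 / 11
height_median = median(heights_cm)                 # 6th of 11 ordered values
height_mode = mode(heights_cm)                     # most common value
height_range = max(heights_cm) - min(heights_cm)   # highest minus lowest

print(f"Mean:   {height_mean:.1f} cm")
print(f"Median: {height_median} cm")
print(f"Mode:   {height_mode} cm")
print(f"Range:  {height_range} cm")
```

The mean (about 167) sits above the median (164) because the two tall students at 180 cm and 190 cm pull it upwards.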
Which measure of location is the “best”?
Mean is generally used, unless extreme values (outliers) exist
Then median is often used, since the median is not sensitive to extreme values
Example: Median home prices may be reported for a region, because the median is less sensitive to outliers
Range
Simplest measure of variation
Difference between the largest and the smallest values in a set of data:
$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$
Example (data plotted on an axis from 0 to 14, smallest value 1, largest value 14): Range = 14 - 1 = 13
Disadvantages of the Range
Ignores the way in which data are distributed
[Figure: two differently shaped distributions on an axis from 7 to 12; both have Range = 12 - 7 = 5]
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5: Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120: Range = 120 - 1 = 119
Getting from the real world to a distribution
When we collect data from the ‘real world’ we need to then represent it in numerically and graphically useful ways. This is where graphical analysis and numerical statistical analysis are helpful.
Say we went into one classroom and observed 22 students with the following reading and mathematics scores.
To help understand the distribution of performance in this class we will calculate the mean, median and mode and also create a histogram of the data. (Do UDM Tut1) UDM Tutorial 1 – Mean, median, mode
student_id  reading_score  math_score
1           508            483
2           437            454
3           378            454
4           355            469
5           388            353
6           378            439
7           399            439
8           437            454
9           447            469
10          355            454
11          399            424
12          490            483
13          437            469
14          419            353
15          516            535
16          456            439
17          525            522
18          447            353
19          437            454
20          456            454
21          456            424
22          551            454
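The tutorial itself is done in Excel, but the same summary statistics can be sketched in Python as a way of checking your answers. This is an illustrative alternative, not part of the tutorial; the variable names are assumptions, and only the reading scores from the table are used.

```python
from statistics import mean, median, mode

# Reading scores for students 1 to 22, copied from the table above.
reading_scores = [
    508, 437, 378, 355, 388, 378, 399, 437, 447, 355, 399,
    490, 437, 419, 516, 456, 525, 447, 437, 456, 456, 551,
]

reading_mean = mean(reading_scores)
reading_median = median(reading_scores)  # average of the 11th and 12th ranked scores
reading_mode = mode(reading_scores)      # the score that occurs most often

print(f"Students: {len(reading_scores)}")
print(f"Mean:     {reading_mean:.1f}")
print(f"Median:   {reading_median}")
print(f"Mode:     {reading_mode}")
```

The same three calls can be repeated on the math scores by replacing the list.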
Mean Median Mode
Create a histogram
To create a histogram, first ensure that the Analysis ToolPak add-in in Excel is enabled:
File → Options → Add-Ins → Analysis ToolPak (click Analysis ToolPak and click “Go” at the bottom)
Under the “Data” tab in Excel you should now have a button which says “Data Analysis” on the far right
Click “Data Analysis”
Click “Histogram”
Highlight the reading marks for the input range
Highlight the bin ranges for the bin range
Click OK
Relabel the bin ranges 0-299, 300-399, 400-449 and so on
Insert graph
If you are still confused about how to create a histogram in Excel, watch this 4-minute video on YouTube: http://www.youtube.com/watch?v=RyxPp22x9PU
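To see what the histogram tool does behind the scenes, the counting step can be sketched in Python. The first three bins follow the labels mentioned above (0-299, 300-399, 400-449); the remaining bins, the function name `bin_counts` and the text-based bars are assumptions made for illustration only.

```python
# Reading scores for the 22 students in the class.
reading_scores = [
    508, 437, 378, 355, 388, 378, 399, 437, 447, 355, 399,
    490, 437, 419, 516, 456, 525, 447, 437, 456, 456, 551,
]

# Each bin is an inclusive (lowest, highest) pair.
# Bins after 400-449 are assumed; choose whatever ranges suit your data.
bins = [(0, 299), (300, 399), (400, 449), (450, 499), (500, 549), (550, 599)]


def bin_counts(scores, bins):
    """Count how many scores fall inside each inclusive bin."""
    return [sum(low <= s <= high for s in scores) for low, high in bins]


counts = bin_counts(reading_scores, bins)

# A simple text histogram: one '#' per student in the bin.
for (low, high), count in zip(bins, counts):
    print(f"{low:>3}-{high:<3} {'#' * count} ({count})")
```

Every score must land in exactly one bin, so the counts should add up to the number of students; that is a useful check whichever tool you use.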
The normal distribution
In a perfect normal distribution the mean, median and mode are equal to each other – 75 here.
Skewness
Negative/Left skew
Positive/Right skew
TIP: To remember if it is positive skew or negative skew, think of the distribution like a door-stop. Does the door touch the positive side or the negative side of the distribution?
Shape of a Distribution
Describes how data are distributed
Measures of shape: symmetric or skewed
Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
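The comparison of mean and median can be turned into a quick check in Python. This is a rough rule of thumb rather than a formal test of skewness (a dataset can have mean equal to median without being perfectly symmetric); the function name `skew_direction` is an assumption for illustration.

```python
from statistics import mean, median


def skew_direction(values):
    """Classify the shape of a dataset by comparing mean and median."""
    m = mean(values)
    med = median(values)
    if m > med:
        return "right-skewed"   # a long tail of high values pulls the mean up
    if m < med:
        return "left-skewed"    # a long tail of low values pulls the mean down
    return "symmetric"


# House prices from the review example: one very expensive house.
house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

print(skew_direction(house_prices))
print(skew_direction([1, 2, 3, 4, 5]))
print(skew_direction([1, 7, 8, 9, 10]))
```

For the house prices the mean ($600,000) is well above the median ($300,000), so the rule labels the data right-skewed.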
Positive and negative skew
Example question
For this graph, will:
The mean > mode?
The median < mean?
The mean = mode?
The mean = median?
The “highest” point in the distribution is always the mode…
Tutorial quiz 1
Go to http://quizstar.4teachers.org/indexs.jsp
Enter your username and password
Click on the “Basic Stats 101” quiz and complete it
If you have any questions, raise your hand and I will come and help you
For those not already registered: you can register as a student on http://quizstar.4teachers.org/indexs.jsp and then search for my class “UDM Msc Education”. Anyone can join the class.
End of Lecture 1
For questions email me at NicholasSpaull@gmail.com
All slides/tutorials available at www.nicspaull.com/teaching
Exploratory Data Analysis
Box-and-Whisker Plot: a graphical display of data using the 5-number summary:
Minimum -- Q1 -- Median -- Q3 -- Maximum
[Figure: example box-and-whisker plot marking Minimum, 1st Quartile, Median, 3rd Quartile and Maximum; each of the four segments contains 25% of the data]
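The 5-number summary behind a box-and-whisker plot can be sketched in Python with the standard `statistics` module. The function name `five_number_summary` and the example data are assumptions for illustration. Quartiles can be calculated under several slightly different conventions, so other software may report slightly different Q1 and Q3 values for the same data; the "inclusive" method is used here.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a dataset."""
    ordered = sorted(values)
    # n=4 splits the data into four parts and returns the three cut points.
    q1, q2, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]


# Hypothetical example data.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
minimum, q1, med, q3, maximum = five_number_summary(data)

print(f"Minimum: {minimum}")
print(f"Q1:      {q1}")
print(f"Median:  {med}")
print(f"Q3:      {q3}")
print(f"Maximum: {maximum}")
```

The box in the plot runs from Q1 to Q3 with a line at the median, and the whiskers extend out to the minimum and maximum.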
Shape of Box-and-Whisker Plots
The Box and central line are centered between the endpoints if data are symmetric around the median
A Box-and-Whisker plot can be shown in either vertical or horizontal format
Min Q1 Median Q3 Max
Distribution Shape and Box-and-Whisker Plot
[Figure: box-and-whisker plots for Left-Skewed, Symmetric and Right-Skewed distributions, each with Q1, Q2 and Q3 marked]