DESCRIPTIVE STATISTICS
MATHS 4TH ESO
JOSÉ JAIME NOGUERA
INTRODUCTION
Statistics is used to collect, organize, analyze and present data.
• POPULATION: the whole group of entities (individuals) that you want to study.
• SAMPLE: a small subset of the population that represents the entire population.
EXAMPLE
Spanish general election:
WHICH PARTY ARE YOU GOING TO VOTE FOR?
– POPULATION: all Spanish citizens over 18.
– SAMPLE: the people you actually ask the question.
Random Variables
A random variable or statistical variable is the characteristic that we want to study in the population.
According to the answer to your question you can classify the variables as:
Random variable:
• QUALITATIVE: the answer is not a number.
• QUANTITATIVE: the answer is a number.
– DISCRETE: the answer is an integer number.
– CONTINUOUS: the answer is a decimal number.
Examples
• We want to study the number of brothers or sisters of a population.
– Discrete quantitative variable.
• We want to study the hair colour of a population.
– Qualitative variable.
• We want to study the height of a population.
– Continuous quantitative variable.
Organizing data: frequency tables and charts
A complete example for a discrete quantitative variable
Study: we ask 25 students how many brothers and sisters they have. The answers are:
1, 3, 0, 1, 2, 2, 0, 1, 2, 1, 2, 1, 1, 0, 1, 0, 1, 0, 1, 1, 3, 2, 2, 1, 1
YOU NEED TO KNOW
• $N$ = the number of data.
• $x_i$ = the value number $i$ of the variable.
• $f_i$ = absolute frequency: the number of times that $x_i$ appears in the answers.
• $h_i$ = relative frequency $= \dfrac{f_i}{N}$
• $F_i$ = absolute cumulative frequency $= f_1 + f_2 + \cdots + f_i$
• $H_i$ = relative cumulative frequency $= h_1 + h_2 + \cdots + h_i = \dfrac{F_i}{N}$
• $H_i$ as % = relative cumulative frequency as a percentage $= H_i \cdot 100\,\%$
FREQUENCY TABLE

| $x_i$ | $f_i$ | $h_i$ | $F_i$ | $H_i$ | $H_i$ as % |
|---|---|---|---|---|---|
| 0 | 5 | 5/25 = 0.2 | 5 | 5/25 = 0.2 | 0.2·100 = 20% |
| 1 | 12 | 12/25 = 0.48 | 5+12 = 17 | 17/25 = 0.68 | 0.68·100 = 68% |
| 2 | 6 | 6/25 = 0.24 | 5+12+6 = 23 | 23/25 = 0.92 | 0.92·100 = 92% |
| 3 | 2 | 2/25 = 0.08 | 5+12+6+2 = 25 | 25/25 = 1 | 1·100 = 100% |

N = 25
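The frequency table can also be built by a short program. The following is only a sketch in Python: the function name `frequency_table` and the row layout are illustrative choices, not part of the slides.

```python
from collections import Counter

# The 25 answers from the study (number of brothers and sisters).
answers = [1, 3, 0, 1, 2, 2, 0, 1, 2, 1, 2, 1, 1,
           0, 1, 0, 1, 0, 1, 1, 3, 2, 2, 1, 1]

def frequency_table(data):
    """Return one row (x_i, f_i, h_i, F_i, H_i) per distinct value, sorted by x_i."""
    n = len(data)
    counts = Counter(data)
    rows = []
    cumulative = 0  # running absolute cumulative frequency F_i
    for x in sorted(counts):
        f = counts[x]
        cumulative += f
        rows.append((x, f, f / n, cumulative, cumulative / n))
    return rows

rows = frequency_table(answers)
for x, f, h, F, H in rows:
    # The last column shows H_i as a percentage.
    print(f"{x}  {f:2d}  {h:.2f}  {F:2d}  {H:.2f}  {H:.0%}")
```

Each printed row corresponds to one row of the table: the value, its absolute and relative frequency, and the two cumulative frequencies.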
CHARTS
BAR CHART
[Figure: bar chart of the absolute frequency $f_i$ (vertical axis, 0 to 14) for each value $x_i$ = 0, 1, 2, 3 (horizontal axis).]
FREQUENCY POLYGON
[Figure: frequency polygon of the absolute frequency $f_i$ (vertical axis, 0 to 14) for each value $x_i$ = 0, 1, 2, 3 (horizontal axis).]
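PIE CHART
[Figure: pie chart with one sector per value: 0 (20%), 1 (48%), 2 (24%), 3 (8%).]
The angle of each sector is $h_i \cdot 360^\circ$. For example, for $x_i = 1$: $0.48 \cdot 360^\circ = 172.8^\circ$.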
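As a quick check of the sector angles, here is a minimal Python sketch. It takes the relative frequencies from the frequency table; the function name `sector_angles` is a hypothetical helper introduced here.

```python
# Relative frequencies h_i of each value x_i, taken from the frequency table.
relative_frequencies = {0: 0.20, 1: 0.48, 2: 0.24, 3: 0.08}

def sector_angles(h):
    """Map each value x_i to the angle of its pie sector, h_i * 360 degrees."""
    return {x: hi * 360 for x, hi in h.items()}

angles = sector_angles(relative_frequencies)
for x, angle in angles.items():
    print(f"x = {x}: {angle:.1f} degrees")
```

The angles should add up to 360 degrees, which is a useful way to detect a mistake in the relative frequencies.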
A complete example for a continuous quantitative variable
If we have too many different values of xi we have to group the values into class intervals.
Example: We know the weight of 30 students:
52 63 71 68 72 69
73 81 53 80 71 72
77 61 83 78 55 60
73 53 66 90 80 96
67 70 82 83 71 61
Choosing the length (or amplitude) of the intervals
Here we have several options:
• If the problem gives the number of intervals, for instance "group the data into 6 intervals":
$$\text{length} = \frac{\text{Max. value} - \text{Min. value}}{6}$$
• If the problem says nothing:
$$\text{length} = \frac{\text{Max. value} - \text{Min. value}}{\sqrt{N}}$$
In our case the problem says nothing, therefore:
$$\text{length} = \frac{96 - 52}{\sqrt{30}} = 8.03$$
The length should be an integer. We always round up to the next integer: 8.03 → length = 9.
Choosing where the first interval starts can also be confusing. Sometimes the intervals are centered on the data, but we will simply start at the minimum value of our data.
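The choice of interval length and starting point can be sketched in Python as follows. The function names are illustrative, and rounding up also in the "given number of intervals" case is an assumption; the slides only state the rounding rule for the case where the problem says nothing.

```python
import math

# Weights of the 30 students from the example.
weights = [52, 63, 71, 68, 72, 69, 73, 81, 53, 80, 71, 72, 77, 61, 83,
           78, 55, 60, 73, 53, 66, 90, 80, 96, 67, 70, 82, 83, 71, 61]

def interval_length(data, num_intervals=None):
    """Interval length: (max - min) / k if k intervals are requested,
    otherwise (max - min) / sqrt(N); always rounded up to an integer."""
    spread = max(data) - min(data)
    divisor = num_intervals if num_intervals is not None else math.sqrt(len(data))
    return math.ceil(spread / divisor)

def class_intervals(data, length):
    """Consecutive intervals of the given length, starting at the minimum value."""
    edges = [min(data)]
    while edges[-1] < max(data):
        edges.append(edges[-1] + length)
    return list(zip(edges[:-1], edges[1:]))

length = interval_length(weights)
intervals = class_intervals(weights, length)
print(length, intervals)
```

For the weights this gives a length of 9 and five intervals starting at 52.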
| Intervals | Class mark $x_i$ | $f_i$ | $h_i$ | $F_i$ | $H_i$ | $H_i$ as % |
|---|---|---|---|---|---|---|
| [52,61) | | | | | | |
| [61,70) | | | | | | |
| [70,79) | | | | | | |
| [79,88) | | | | | | |
| [88,97] | | | | | | |

Pay attention! The last interval, [88,97], is closed at both ends.
| Intervals | Class mark $x_i$ | $f_i$ | $h_i$ | $F_i$ | $H_i$ | $H_i$ as % |
|---|---|---|---|---|---|---|
| [52,61) | $\frac{52+61}{2} = 56.5$ | | | | | |
| [61,70) | 65.5 | | | | | |
| [70,79) | 74.5 | | | | | |
| [79,88) | 83.5 | | | | | |
| [88,97] | 92.5 | | | | | |
| Intervals | Class mark $x_i$ | $f_i$ | $h_i$ | $F_i$ | $H_i$ | $H_i$ as % |
|---|---|---|---|---|---|---|
| [52,61) | 56.5 | 5 | | | | |
| [61,70) | 65.5 | 7 | | | | |
| [70,79) | 74.5 | 10 | | | | |
| [79,88) | 83.5 | 6 | | | | |
| [88,97] | 92.5 | 2 | | | | |
| Intervals | Class mark $x_i$ | $f_i$ | $h_i$ | $F_i$ | $H_i$ | $H_i$ as % |
|---|---|---|---|---|---|---|
| [52,61) | 56.5 | 5 | 5/30 = 0.17 | 5 | 5/30 = 0.17 | 17% |
| [61,70) | 65.5 | 7 | 0.23 | 5+7 = 12 | 12/30 = 0.4 | 40% |
| [70,79) | 74.5 | 10 | 0.33 | 22 | 0.73 | 73% |
| [79,88) | 83.5 | 6 | 0.2 | 28 | 0.93 | 93% |
| [88,97] | 92.5 | 2 | 0.07 | 30 | 1 | 100% |

N = 30
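The grouped table can be checked with a short Python sketch. It assumes the interval edges chosen above and treats every interval as half-open except the last one, which is closed; the function name `grouped_table` is illustrative.

```python
# Weights of the 30 students and the interval edges chosen above.
weights = [52, 63, 71, 68, 72, 69, 73, 81, 53, 80, 71, 72, 77, 61, 83,
           78, 55, 60, 73, 53, 66, 90, 80, 96, 67, 70, 82, 83, 71, 61]
edges = [52, 61, 70, 79, 88, 97]

def grouped_table(data, edges):
    """Return one row (low, high, class mark, f_i, h_i, F_i, H_i) per interval."""
    n = len(data)
    last = len(edges) - 2  # index of the last interval, which is closed on the right
    rows = []
    cumulative = 0
    for i, (low, high) in enumerate(zip(edges[:-1], edges[1:])):
        if i == last:
            f = sum(low <= x <= high for x in data)
        else:
            f = sum(low <= x < high for x in data)
        cumulative += f
        rows.append((low, high, (low + high) / 2, f, f / n, cumulative, cumulative / n))
    return rows

rows = grouped_table(weights, edges)
for low, high, mark, f, h, F, H in rows:
    print(f"[{low},{high})  {mark}  {f:2d}  {h:.2f}  {F:2d}  {H:.2f}")
```

The class mark is simply the midpoint of each interval, so it does not depend on the data.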
HISTOGRAM
[Figure: histogram of the absolute frequency $f_i$ (vertical axis, 0 to 12) over the class intervals with edges 52, 61, 70, 79, 88, 97 (horizontal axis).]
Exercise
In a clothing store, the number of garments sold per day is:
a) Make a frequency table grouping the data into 6 class intervals.
b) Draw the appropriate chart.
Statistical Parameters
STATISTICAL CONCENTRATION PARAMETERS
They are also known as CENTRAL TENDENCY MEASURES:
• MEAN: $\bar{x} = \dfrac{x_1 f_1 + x_2 f_2 + \cdots + x_n f_n}{N} = \dfrac{\sum_{i=1}^{n} x_i f_i}{N}$
• MODE: Mo is the $x_i$ with the greatest $f_i$.
• MEDIAN: Me is the value in the middle of the data when they are in order.
Discrete quantitative variable
• Using the data of our previous example:
$$\bar{x} = \frac{0 \cdot 5 + 1 \cdot 12 + 2 \cdot 6 + 3 \cdot 2}{25} = 1.2$$
Mo = 1 (because its $f_i$ is the greatest one)
Me = 1, because 50% of N = 25 is 0.5·25 = 12.5, and the first $F_i$ greater than or equal to 12.5 is $F_2$ = 17, which corresponds to $x_2$ = 1.

| $x_i$ | $f_i$ | $F_i$ |
|---|---|---|
| 0 | 5 | 5 |
| 1 | 12 | 17 |
| 2 | 6 | 23 |
| 3 | 2 | 25 |

N = 25
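These three measures can be computed directly from the $x_i$ and $f_i$ columns. The sketch below is in Python; the function names are illustrative, and the median uses the "first $F_i$ greater than or equal to 0.5·N" rule described above.

```python
# x_i and f_i columns of the frequency table.
values = [0, 1, 2, 3]
freqs = [5, 12, 6, 2]

def mean(values, freqs):
    """Weighted mean: sum of x_i * f_i divided by N."""
    return sum(x * f for x, f in zip(values, freqs)) / sum(freqs)

def mode(values, freqs):
    """The x_i with the greatest f_i (the first one if there is a tie)."""
    return max(zip(values, freqs), key=lambda pair: pair[1])[0]

def median(values, freqs):
    """The x_i of the first cumulative frequency F_i >= 0.5 * N."""
    half = 0.5 * sum(freqs)
    cumulative = 0
    for x, f in zip(values, freqs):
        cumulative += f
        if cumulative >= half:
            return x

print(mean(values, freqs), mode(values, freqs), median(values, freqs))
```

The same functions should also work for grouped data if the class marks are passed as `values`.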
Continuous quantitative variable
• Using the data of our previous example:
$$\bar{x} = \frac{56.5 \cdot 5 + 65.5 \cdot 7 + 74.5 \cdot 10 + 83.5 \cdot 6 + 92.5 \cdot 2}{30} = 72.4$$
Mo = 74.5, or modal interval = [70,79)
Me = 74.5, or median class interval = [70,79), because 50% of N = 30 is 0.5·30 = 15, and the first $F_i$ greater than or equal to 15 is $F_3$ = 22, which corresponds to $x_3$ = 74.5.

| Intervals | $x_i$ | $f_i$ | $F_i$ |
|---|---|---|---|
| [52,61) | 56.5 | 5 | 5 |
| [61,70) | 65.5 | 7 | 12 |
| [70,79) | 74.5 | 10 | 22 |
| [79,88) | 83.5 | 6 | 28 |
| [88,97] | 92.5 | 2 | 30 |

N = 30
STATISTICAL POSITION PARAMETERS
• QUARTILES: the points that divide the ordered data into four equal parts:
– $Q_1$: first quartile. 25% of the data lie below $Q_1$.
– $Q_2$ = Me: second quartile. 50% of the data lie below $Q_2$.
– $Q_3$: third quartile. 75% of the data lie below $Q_3$.
• PERCENTILES: $P_k$ is the value below which k% of the data lie.
Discrete quantitative variable
• Using the data of our previous example:

| $x_i$ | $f_i$ | $F_i$ |
|---|---|---|
| 0 | 5 | 5 |
| 1 | 12 | 17 |
| 2 | 6 | 23 |
| 3 | 2 | 25 |

N = 25
• $Q_1$ → 0.25·25 = 6.25. The first $F_i$ greater than or equal to 6.25 is $F_2$ = 17, which corresponds to $x_2$ = 1. Hence $Q_1$ = 1.
• $Q_2$ → 0.5·25 = 12.5. The first $F_i$ greater than or equal to 12.5 is $F_2$ = 17, which corresponds to $x_2$ = 1. Therefore $Q_2$ = 1 = Me.
• $Q_3$ → 0.75·25 = 18.75. The first $F_i$ greater than or equal to 18.75 is $F_3$ = 23, which corresponds to $x_3$ = 2. Then $Q_3$ = 2.
• $P_{95}$ → 0.95·25 = 23.75. The first $F_i$ greater than or equal to 23.75 is $F_4$ = 25, which corresponds to $x_4$ = 3. Hence $P_{95}$ = 3.
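The rule used in these four calculations is always the same, so it can be written once. The following Python sketch uses a hypothetical helper, `position_parameter`, that takes the percentage k (25, 50 and 75 for the quartiles); comparing `F_i * 100` with `k * N` is just a way of avoiding rounding problems with decimals.

```python
def position_parameter(values, freqs, k):
    """Return the x_i of the first cumulative frequency F_i >= k% of N."""
    n = sum(freqs)
    cumulative = 0
    for x, f in zip(values, freqs):
        cumulative += f
        # F_i >= (k / 100) * N, written with integers only.
        if cumulative * 100 >= k * n:
            return x

values = [0, 1, 2, 3]
freqs = [5, 12, 6, 2]
q1 = position_parameter(values, freqs, 25)
q2 = position_parameter(values, freqs, 50)
q3 = position_parameter(values, freqs, 75)
p95 = position_parameter(values, freqs, 95)
print(q1, q2, q3, p95)
```

With class marks as `values`, the same function can be applied to grouped data.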
Continuous quantitative variable
• Using the data of our previous example:
• $Q_1$ → 0.25·30 = 7.5. The first $F_i$ greater than or equal to 7.5 is $F_2$ = 12, which corresponds to $x_2$ = 65.5. Hence $Q_1$ = 65.5.
• $Q_2$ → 0.5·30 = 15. The first $F_i$ greater than or equal to 15 is $F_3$ = 22, which corresponds to $x_3$ = 74.5. Therefore $Q_2$ = Me = 74.5.
• $Q_3$ → 0.75·30 = 22.5. The first $F_i$ greater than or equal to 22.5 is $F_4$ = 28, which corresponds to $x_4$ = 83.5. Then $Q_3$ = 83.5.
• $P_{30}$ → 0.30·30 = 9. The first $F_i$ greater than or equal to 9 is $F_2$ = 12, which corresponds to $x_2$ = 65.5. Hence $P_{30}$ = 65.5.

| Intervals | $x_i$ | $f_i$ | $F_i$ |
|---|---|---|---|
| [52,61) | 56.5 | 5 | 5 |
| [61,70) | 65.5 | 7 | 12 |
| [70,79) | 74.5 | 10 | 22 |
| [79,88) | 83.5 | 6 | 28 |
| [88,97] | 92.5 | 2 | 30 |

N = 30
Box and whisker plot
Discrete quantitative variable
• We know that:
– Minimum value = 0
– $Q_1$ = 1
– $Q_2$ = Me = 1
– $Q_3$ = 2
– Maximum value = 3
Continuous quantitative variable
• We know that:
– Minimum value = 52
– $Q_1$ = 65.5
– $Q_2$ = Me = 74.5
– $Q_3$ = 83.5
– Maximum value = 97
Improving the quartile calculation
Once you know the basics of quartiles, let's see a special case:
• Calculate the quartiles of:
1, 1, 1, 2, 3, 3, 4, 4, 4, 5, 5, 5
The frequency table is:

| $x_i$ | $f_i$ | $F_i$ |
|---|---|---|
| 1 | 3 | 3 |
| 2 | 1 | 4 |
| 3 | 2 | 6 |
| 4 | 3 | 9 |
| 5 | 3 | 12 |

N = 12
If you want to calculate the median:
• $Q_2$ → 0.5·12 = 6. The first $F_i$ greater than or equal to 6 is $F_3$ = 6, which corresponds to $x_3$ = 3. Therefore $Q_2$ = 3 = Me.
But this is not right, because if we look at the ordered data the median should be:
1, 1, 1, 2, 3, 3, 4, 4, 4, 5, 5, 5
$$Me = \frac{3 + 4}{2} = 3.5$$
To solve this drawback, we simply calculate the median (or any other quartile or percentile) as
$$\frac{x_i + x_{i+1}}{2}$$
when we find an $F_i$ exactly equal to 0.5·N (or k% of N). In other cases we calculate the quartiles as usual. From now on we will calculate the quartiles as explained in this slide.
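The improved rule can be sketched in Python as follows. The helper name `position_parameter` and the integer comparison `F_i * 100 == k * N` are illustrative choices; the last value has no successor, so in that case the sketch simply returns $x_i$, which is an assumption not covered by the slides.

```python
def position_parameter(values, freqs, k):
    """Quartile/percentile P_k with the averaging rule:
    if some F_i equals k% of N exactly, return (x_i + x_{i+1}) / 2;
    otherwise return the x_i of the first F_i greater than k% of N."""
    n = sum(freqs)
    cumulative = 0
    for i, (x, f) in enumerate(zip(values, freqs)):
        cumulative += f
        # Exact hit: average x_i with the next value (if there is one).
        if cumulative * 100 == k * n and i + 1 < len(values):
            return (x + values[i + 1]) / 2
        if cumulative * 100 >= k * n:
            return x

values = [1, 2, 3, 4, 5]
freqs = [3, 1, 2, 3, 3]
median = position_parameter(values, freqs, 50)
print(median)
```

For the earlier example with N = 25 no $F_i$ coincides with 6.25, 12.5 or 18.75, so the quartiles found there do not change.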
Dispersion (spread) Parameters
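• Range: R = Max. value − Min. value.
• Average deviation: $D_{\bar{x}} = \dfrac{f_1 |x_1 - \bar{x}| + f_2 |x_2 - \bar{x}| + \cdots + f_n |x_n - \bar{x}|}{N}$
• Variance: $\sigma^2 = \dfrac{f_1 (x_1 - \bar{x})^2 + f_2 (x_2 - \bar{x})^2 + \cdots + f_n (x_n - \bar{x})^2}{N}$
• Standard deviation: $\sigma = \sqrt{\sigma^2}$
• Coefficient of variation: $CV = \dfrac{\sigma}{\bar{x}}$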
Example
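We know that:
$\bar{x} = 1.2$
• Range: R = 3 − 0 = 3
• Average deviation: $D_{\bar{x}} = \dfrac{5|0 - 1.2| + 12|1 - 1.2| + 6|2 - 1.2| + 2|3 - 1.2|}{25} = 0.67$
• Variance: $\sigma^2 = \dfrac{5(0 - 1.2)^2 + 12(1 - 1.2)^2 + 6(2 - 1.2)^2 + 2(3 - 1.2)^2}{25} = 0.72$
• Standard deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{0.72} = 0.85$
• Coefficient of variation: $CV = \dfrac{\sigma}{\bar{x}} = \dfrac{0.85}{1.2} = 0.71$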
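The arithmetic of this example can be verified with a short Python sketch. The function name `dispersion` and the dictionary keys are illustrative; the formulas are the ones listed in the previous slide, dividing by N.

```python
import math

def dispersion(values, freqs):
    """Range, average deviation, variance, standard deviation and CV
    from the x_i and f_i columns of a frequency table."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    average_deviation = sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    std = math.sqrt(variance)
    return {
        "range": max(values) - min(values),
        "average_deviation": average_deviation,
        "variance": variance,
        "std": std,
        "cv": std / mean,
    }

result = dispersion([0, 1, 2, 3], [5, 12, 6, 2])
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Note that the range computed this way uses the smallest and largest $x_i$, which for grouped data are class marks rather than the actual extreme values.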
Exercise
• Calculate the spread parameters:
$\bar{x} = 72.4$
Interpreting the spread measures
• A: $\bar{x} = 3$, $\sigma = 1.03$, $CV = \dfrac{\sigma}{\bar{x}} = 0.34$
• B: $\bar{x} = 3$, $\sigma = 1.68$, $CV = \dfrac{\sigma}{\bar{x}} = 0.56$
• C: $\bar{x} = 30$, $\sigma = 16.8$, $CV = \dfrac{\sigma}{\bar{x}} = 0.56$
• A and B have the same mean, but the data dispersion is greater in B because its $\sigma$ is greater (and so is its CV).
• The CV is useful when we compare two sets of data whose units are different.
• In A and B the CV contains the same information as $\sigma$ because the units are the same (1, 2, 3, 4, 5). But if we compare B and C, $\sigma$ is not useful: it seems that the data dispersion is greater in C ($\sigma$ is greater), but that is not true because the units are different, so we have to use the CV. In fact, the data spread in B and C is the same (because they have the same CV).
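One way to see why B and C share the same CV is a short derivation. It assumes that the data of C are the data of B multiplied by a positive constant $k$ (the values above are consistent with $k = 10$, although the slides do not state how C was obtained); the symbols $y_i$, $\bar{y}$ and $\sigma_y$ are introduced here only for this sketch.

```latex
\begin{aligned}
y_i &= k\,x_i \quad (k > 0) \\
\bar{y} &= \frac{\sum_i f_i\,k\,x_i}{N} = k\,\bar{x} \\
\sigma_y^2 &= \frac{\sum_i f_i\,(k\,x_i - k\,\bar{x})^2}{N} = k^2\,\sigma_x^2
\quad\Longrightarrow\quad \sigma_y = k\,\sigma_x \\
CV_y &= \frac{\sigma_y}{\bar{y}} = \frac{k\,\sigma_x}{k\,\bar{x}} = CV_x
\end{aligned}
```

Under that assumption, changing the units multiplies $\sigma$ by $k$ but leaves the CV unchanged, which matches $16.8 = 10 \cdot 1.68$ and $CV = 0.56$ in both cases.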
Dispersion diagrams
If we have pairs of data $(x_i, y_i)$ and we plot them as points, we obtain a dispersion diagram (also called a scatter plot).
Example:
Correlation
• If the point cloud is close to a straight line, there is linear correlation.
• If the point cloud is close to another type of curve, there is correlation, but it is nonlinear.
• If the point cloud is not close to any curve, there is no correlation.
Example
• In a laboratory, we give some mice three medicines A, B and C in order to cure a disease. We plot the quantity of substance (X axis) against the number of dead mice (Y axis), obtaining one dispersion diagram for each substance:
[Figure: three dispersion diagrams, labelled A, B and C.]
• In A there is positive linear correlation: as X increases, Y tends to increase. This is not a good choice because the medicine kills the mice.
• In B there is negative linear correlation: as X increases, Y decreases. B is a good medicine because it reduces the number of dead mice.
• In C there is no correlation. C does not have anything to do with the disease.