S1 Revision Notes

Revision notes for the S1 (Statistics 1) module.

Page 1: s1 Revision Notes

Revision Notes

Binomial Distribution Hypothesis Tester

[Chart: vertical line graph of P(X = r) against r for r = 0 to 20, probability scale 0.00 to 0.25, with the "Accept H0" and "Reject H0" regions marked.]

Bob Francis 2005

Page 2: s1 Revision Notes

2 Revision Notes Statistics 1

Data Presentation [Topic / Examples / References]

Classifying data
Categorical: non-numerical categories, e.g. favourite colours of 30 children, political party voted for by 1000 electors.
Discrete: numerical data taking particular, often integral, values, e.g. number of goals, shoe size.
Continuous: numerical values measured to a given accuracy, e.g. length, weight, time, speed.

Categorical: Political parties: Conservative, Labour, Liberal Democrats, Greens, etc. Discrete: Goals scored in consecutive games: 3, 3, 0, 4, 2, 1, 2, 1, 0, 2, 3, 3, 4, 2, 1, 2, 2, 3, 1, 2 Continuous: Heights measured to nearest cm: 181, 178, 160, 182, 166, 169, 173, 159, 180, 177, 177, 182, 173, 174, 161, 177, 185, 166, 166, 196.

MEI Stats 1 Pages 12 and 13 Categorical data; Discrete data; Continuous data
Pages 22 to 23 Grouped data

Frequency Distributions Categorical: frequencies of various non-numerical categories, e.g. % supporting political parties. Discrete: frequencies of discrete values, e.g. goals scored by one team in 20 consecutive games. Continuous: frequencies of continuous values in class intervals with associated boundaries, e.g. no. students with heights measured to nearest 5 cm. Grouped discrete data can be treated as if it were continuous, e.g. distribution of marks in a test.

Categorical: Conservative 23% , Labour 42%, Liberal Democrats 18%, Greens 7%, rest 10%

MEI Stats 1 Pages 17 to 19

Frequency distributions

Pages 24 to 26 Grouped data

Displaying Frequency Distributions Categorical: Use a bar chart (or pie chart) with heights (or angles) proportional to frequencies. Discrete: Use a vertical line chart with heights proportional to frequencies. Continuous: Use a histogram with equal or unequal class intervals; area of rectangle proportional to frequency; height of rectangle gives frequency density. Use a cumulative frequency curve to plot cumulative frequencies against upper class boundaries for continuous (or grouped discrete) data. Interpretation: median, IQR, percentiles.

MEI Stats 1 Pages 56 to 58 Bar charts and vertical line charts
Pages 62 to 69 Histograms
Pages 74 to 77 Cumulative frequency curves

Stem and leaf diagrams Concise way of displaying discrete or continuous data (measured to a given accuracy) whilst retaining the original information. Data usually sorted in ascending order. Interpretation: 'Shape' of distribution; mode, median and quartiles.

MEI Stats 1 Pages 6 to 8

Stem and leaf diagrams

Box and whisker plots Simple way of displaying median, inter-quartile range and range for discrete or continuous data. Interpretation: Comparison of two distributions; medians, IQRs and ranges.

MEI Stats 1 Pages 73 and 74

Box & whisker plots

Skewness A frequency distribution for discrete or continuous data may exhibit symmetry, positive skew or negative skew, according to its 'shape'.

The discrete frequency distribution example (goals scored by a football team) is roughly symmetrical. The stem and leaf example (distribution of marks) exhibits negative skewness. The distribution of lengths of telephone calls may well exhibit positive skewness, peaking well to the left of the mid-range.

MEI Stats 1 Pages 5 and 6

Shapes of distributions

Page 3: s1 Revision Notes


Central Tendency and Dispersion [Topic / Examples / References]

Central Tendency ['averages']
Mode: most frequently occurring value in a data set; a frequency distribution with a single mode is unimodal; a frequency distribution with two distinct modes is bimodal.
Midrange: (minimum value + maximum value) ÷ 2
Median: middle value when the data are arranged in order; for an even total frequency, average the middle 2 values.
Mean: (total of all data values) ÷ (total frequency):
x̄ = Σx / n  (raw data)   or   x̄ = Σxf / Σf  (frequency distribution)

MEI Stats 1 Pages 13 to 16

Central Tendency Pages 24 to 27 Grouped data
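As a quick check of these definitions, the measures can be computed with Python's standard library. This is a sketch only; the data set is the goals-scored example from these notes:

```python
from statistics import mean, median, multimode

# Goals scored in 20 consecutive games (example data from these notes)
goals = [3, 3, 0, 4, 2, 1, 2, 1, 0, 2, 3, 3, 4, 2, 1, 2, 2, 3, 1, 2]

print(mean(goals))                    # Σxf / Σf = 41/20 = 2.05
print(median(goals))                  # average of the two middle values -> 2
print(multimode(goals))               # most frequent value(s) -> [2]
print((min(goals) + max(goals)) / 2)  # midrange = (0 + 4) / 2 = 2.0
```

Note that `multimode` returns a list, since (as with the heights example later) a data set may be bimodal.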

Quartiles and Percentiles
Lower (Q1) and upper (Q3) quartiles: values ¼ way and ¾ way through the distribution.
Percentile: the rth percentile is the value r/100 of the way through the distribution.

MEI Stats 1 Pages 71 and 72

Quartiles

Dispersion ['spread']
Range: maximum value − minimum value
Inter-quartile range (IQR): upper quartile − lower quartile = Q3 − Q1
Sum of squares:
Sxx = Σ(x − x̄)² ≡ Σx² − n x̄²  (raw data)
Sxx = Σ(x − x̄)²f ≡ Σx²f − n x̄²  (frequency dist.)
Mean square deviation: Sxx / n
Root mean square deviation (rmsd): √(Sxx / n)
Variance: Sxx / (n − 1)
Standard deviation: s = √(Sxx / (n − 1))

MEI Stats 1 Pages 31 to 40 Range; Sum of squares; Root mean square deviation; Standard deviation
Page 73 Inter-Quartile Range
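These dispersion formulas can be sketched directly in Python (standard library only). The function name is my own; the data set is the heights example from these notes:

```python
import math

def dispersion(xs):
    """Return Sxx, rmsd and standard deviation s for raw data."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)  # Sxx = Σ(x − x̄)²
    rmsd = math.sqrt(sxx / n)               # root mean square deviation
    s = math.sqrt(sxx / (n - 1))            # standard deviation
    return sxx, rmsd, s

heights = [159, 160, 161, 166, 166, 166, 169, 173, 173, 174,
           177, 177, 177, 178, 180, 181, 182, 182, 185, 196]
sxx, rmsd, s = dispersion(heights)
print(round(sxx, 1), round(rmsd, 2), round(s, 2))  # 1669.8 9.14 9.37
```

The only difference between rmsd and s is the divisor (n versus n − 1) inside the square root.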

Using a calculator
Make sure that you can use a scientific or graphical calculator to find the mean [x̄], root mean square deviation [rmsd, σn] and standard deviation [s, σn−1] of a raw data set and of a frequency distribution.

Graphical calculator: Data Analysis for the TI-83+ accompanying notes

Outliers Can be applied to data which are: (a) at least 2 standard deviations from the mean i.e. beyond x ± 2s (b) at least 1.5 × IQR beyond the nearer quartile i.e. below Q1 – 1.5×IQR or above Q3 + 1.5×IQR
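Both outlier rules can be expressed as short Python functions (a sketch; the function names are my own, and the usage values come from the heights example in these notes):

```python
def outliers_sd(xs, xbar, s):
    """Rule (a): values beyond x̄ ± 2s (at least 2 s.d. from the mean)."""
    return [x for x in xs if x < xbar - 2 * s or x > xbar + 2 * s]

def outliers_iqr(xs, q1, q3):
    """Rule (b): values at least 1.5 × IQR beyond the nearer quartile."""
    iqr = q3 - q1
    return [x for x in xs if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

heights = [159, 160, 161, 166, 166, 166, 169, 173, 173, 174,
           177, 177, 177, 178, 180, 181, 182, 182, 185, 196]
print(outliers_sd(heights, 174.1, 9.37))  # [196]: beyond 192.84
print(outliers_iqr(heights, 166, 180.5))  # []: none beyond 144.25 or 202.25
```

The two rules can disagree, as they do here: 196 is an outlier by rule (a) but not by rule (b).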

Raw Data
Heights measured to nearest cm: 159, 160, 161, 166, 166, 166, 169, 173, 173, 174, 177, 177, 177, 178, 180, 181, 182, 182, 185, 196.
Modes = 166 and 177 (i.e. the data set is bimodal)
Midrange = (159 + 196) ÷ 2 = 177.5
Median = (174 + 177) ÷ 2 = 175.5
Mean: x̄ = Σx / n = 3482 / 20 = 174.1
Range = 196 − 159 = 37
Lower quartile Q1 = 166; Upper quartile Q3 = 180.5
Inter-quartile range (IQR) = 180.5 − 166 = 14.5
Sum of squares: Sxx = Σx² − n x̄² = 607886 − 20 × 174.1² = 1669.8
Root mean square deviation: rmsd = √(Sxx / n) = √(1669.8 / 20) = 9.14 (3 s.f.)
Standard deviation: s = √(Sxx / (n − 1)) = √(1669.8 / 19) = 9.37 (3 s.f.)
Outliers (a): 174.1 ± 2 × 9.37 gives limits 155.36 and 192.84; the value 196 lies beyond these limits, so there is one outlier.
Outliers (b): 166 − 1.5 × 14.5 = 144.25 and 180.5 + 1.5 × 14.5 = 202.25; no values lie beyond these limits, so no outliers.

Frequency Distribution
Goals scored by one team in 20 consecutive games:

Goals scored (x):  0  1  2  3  4
Frequency (f):     2  4  7  5  2

Mode = 2
Midrange = (0 + 4) ÷ 2 = 2
Median = 2 (goals scored in the 10th and 11th matches)
Mean: x̄ = Σxf / Σf = 41 / 20 = 2.05
Lower quartile Q1 = 1; Upper quartile Q3 = 3
Range = 4 − 0 = 4
Inter-quartile range (IQR) = 3 − 1 = 2
Sum of squares: Sxx = Σx²f − n x̄² = 109 − 20 × 2.05² = 24.95
Root mean square deviation: rmsd = √(Sxx / n) = √(24.95 / 20) = 1.12 (3 s.f.)
Standard deviation: s = √(Sxx / (n − 1)) = √(24.95 / 19) = 1.15 (3 s.f.)
Outliers (a): 2.05 ± 2 × 1.15 gives limits −0.25 and 4.35; no values lie beyond these limits, so no outliers.
Outliers (b): 1 − 1.5 × 2 = −2 and 3 + 1.5 × 2 = 6; no values lie beyond these limits, so no outliers.

MEI Stats 1 Pages 40 and 41

Outliers & s.d.

Pages 73 and 74 Outliers & IQR

Coding
If y = ax + b then: ȳ = a x̄ + b and sy = |a| sx

For data sets x and y with y = 5x − 20: given x̄ = 24.8 and sx = 7.3:
ȳ = 5 × 24.8 − 20 = 104 and sy = 5 × 7.3 = 36.5

MEI Stats 1 Pages 43 to 45

Linear coding
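The effect of linear coding on the mean and standard deviation can be checked numerically. A sketch, with an invented data set chosen so that x̄ = 24.8 and sx = 7.3 as in the example:

```python
from statistics import mean, stdev

# Invented data with mean 24.8 and standard deviation 7.3
xs = [17.5, 24.8, 32.1]
ys = [5 * x - 20 for x in xs]  # coding y = 5x − 20

print(mean(xs), stdev(xs))     # 24.8 7.3
print(mean(ys), stdev(ys))     # ȳ = 5x̄ − 20 = 104.0, sy = 5 × 7.3 = 36.5
```

Adding b shifts every value (and hence the mean) but leaves the spread unchanged; multiplying by a scales both.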

Page 4: s1 Revision Notes


Probability 1 [Topic / Examples / References]

Probability of events
Probability describes the likelihood of an event occurring in a statistical experiment. Probability is measured on a scale of 0 to 1.
The theoretical probability of an event A is given by P(A) = n(A) / n(ξ), where A is the set of favourable outcomes and ξ is the set of all possible outcomes.
The experimental probability of an event is: (number of successes) ÷ (number of trials)
The complementary event of A is denoted A' and is defined as the set of possible outcomes not in set A. Hence P(A') = 1 − P(A).
The expectation (expected frequency) of an event is the number of times it is expected to occur in n repetitions of the experiment, and is given by: expected frequency = n × P(A).

Sample space
The sample space for an experiment illustrates the set of all possible outcomes. An event is therefore a sub-set of the sample space. Probabilities can be calculated from first principles.

MEI Stats 1 Pages 87 to 91 Measuring probability; Experimental and theoretical probability; The complement of an event; Expectation (expected frequency)

Addition rule for probability For any two events A and B:

P(A or B) = P(A) + P(B) – P(A and B) or P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

Events A and B are mutually exclusive if they cannot happen simultaneously, i.e. the occurrence of one event excludes the occurrence of the other event,

=> P(A ∩ B) = 0 Addition rule for mutually exclusive events:

P(A or B) = P(A ∪ B) = P(A) + P(B)

MEI Stats 1 Pages 92 to 94 Probability of one event or another
Pages 94 and 95 Mutually exclusive events

Multiplication rule for probability For any two (dependent) events A and B:

P(A and B) = P(A ∩ B) = P(A) × P(B|A) Events A and B are independent if the occurrence of one has no effect on the occurrence of the other,

⇒ P(B|A) = P(B) and P(A|B) = P(A) Multiplication rule for independent events:

P(A and B) = P(A ∩ B) = P(A) × P(B)

Experimental Probability
In a statistical experiment a drawing pin is thrown 100 times, landing point-down 37 times. The probability of event A (the drawing pin landing point-down) may be estimated as:
P(A) ≈ 37/100 = 0.37
⇒ P(A') = 1 − P(A) = 1 − 0.37 = 0.63

Theoretical Probability
An ordinary pack of cards is shuffled and a card chosen at random. The probability of event A (card chosen is a picture card) is calculated by:
P(A) = 12/52 = 3/13
⇒ P(A') = 1 − P(A) = 1 − 3/13 = 10/13
If the experiment is repeated 100 times, then the expectation (expected frequency) of a picture card being chosen
= n × P(A) = 100 × 3/13 = 23.1 (to 3 s.f.)

Sample Space
Two fair dice are thrown and their scores added. The sample space is:

 +  |  1   2   3   4   5   6
 1  |  2   3   4   5   6   7
 2  |  3   4   5   6   7   8
 3  |  4   5   6   7   8   9
 4  |  5   6   7   8   9  10
 5  |  6   7   8   9  10  11
 6  |  7   8   9  10  11  12

Event A (Total = 7): P(A) = 6/36 = 1/6
Event B (Total > 8): P(B) = 10/36 = 5/18
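These probabilities can be checked from first principles by enumerating the sample space (a sketch using Python's standard library):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely totals of two fair dice
totals = [a + b for a, b in product(range(1, 7), repeat=2)]

p_a = Fraction(totals.count(7), len(totals))             # Event A: total = 7
p_b = Fraction(sum(t > 8 for t in totals), len(totals))  # Event B: total > 8

print(p_a)  # 1/6
print(p_b)  # 5/18
```

Using `Fraction` keeps the probabilities exact, matching the hand calculation.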

Non-mutually exclusive events
An ordinary pack of cards is shuffled and a card chosen at random.
Event A (card chosen is a picture card): P(A) = 12/52
Event B (card chosen is a 'heart'): P(B) = 13/52
Since P(card is a picture 'heart') = P(A ∩ B) = 3/52:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 12/52 + 13/52 − 3/52 = 22/52 = 11/26

Mutually exclusive events
Two fair dice are thrown and their scores are added.
Event A (Total = 7): P(A) = 6/36 = 1/6
Event B (Total > 8): P(B) = 10/36 = 5/18
Since P(Total = 7 and Total > 8) = P(A ∩ B) = 0:
P(A ∪ B) = P(A) + P(B) = 6/36 + 10/36 = 16/36 = 4/9

MEI Stats 1 Pages 109 to 110 Dependent and independent events


Page 5: s1 Revision Notes


Probability 2 [Topic / Examples / References]

Tree diagrams
A useful way of illustrating probabilities for both independent and dependent events. Multiply probabilities along the branches (and); add probabilities at the ends of branches (or).

[Tree diagrams for independent and dependent events]

Tree diagrams may have more than two branches at each division and/or more than two sets. Tree diagrams may be asymmetrical.

MEI Stats 1 Pages 98 to 101 The probability of events from two trials

Conditional Probability
The multiplication law for dependent probabilities may be rearranged to give:
P(B | A) = P(A ∩ B) / P(A)   or   P(A | B) = P(A ∩ B) / P(B)

If event A logically precedes event B then the right-hand version is useful for calculating posterior conditional probability.

MEI Stats 1 Pages 107 to 113

Conditional probability

Independent events A child's toy has two parts; 90% of top parts and 75% of bottom parts are perfect. Parts are placed together at random. Event A (top part is perfect): P(A) = 0.9 Event B (bottom part is perfect): P(B) = 0.75

⇒ P(A ∩ B) = P(A) × P(B) = 0.9 × 0.75 = 0.675

Dependent events
A pack of cards is shuffled; two cards are chosen at random without replacement.
Event A (1st card is a picture card): P(A) = 12/52
Event B (2nd card is a picture card): P(B | A) = 11/51
⇒ P(A ∩ B) = P(A) × P(B | A) = 12/52 × 11/51 = 11/221
⇒ P(B) = 12/52 × 11/51 + 40/52 × 12/51 = 3/13
The conditional probability of "A given B" is:
P(A | B) = P(A ∩ B) / P(B) = (11/221) ÷ (3/13) = 11/51
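The dependent-events calculation can be reproduced exactly with Python's `fractions` module (a sketch; the variable names are my own):

```python
from fractions import Fraction

p_a = Fraction(12, 52)          # P(A): 1st card is a picture card
p_b_given_a = Fraction(11, 51)  # P(B | A): 2nd is a picture card, given A
p_a_and_b = p_a * p_b_given_a   # multiplication rule -> 11/221

# Total probability that the 2nd card is a picture card (two branches)
p_b = p_a * Fraction(11, 51) + Fraction(40, 52) * Fraction(12, 51)

p_a_given_b = p_a_and_b / p_b   # rearranged multiplication law

print(p_a_and_b, p_b, p_a_given_b)  # 11/221 3/13 11/51
```

Note that P(A | B) = P(B | A) here; by symmetry, each card is equally likely to be a picture card before any card is seen.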

Combinations and Probability
The number of ways of arranging n distinct objects in order is n!, where n! = n × (n − 1) × ... × 3 × 2 × 1. [Special case: 0! = 1]
The number of ways of choosing (or selecting) r from n distinct objects is nCr, where
nCr = n! / (r! (n − r)!),  for r = 0, 1, 2, ..., n
Suppose that n distinct objects are divided into types S and T, where n(S) = n1 and n(T) = n2, and r objects are selected at random from the n objects. The probability that there are r1 of type S and r2 of type T is:
(n1Cr1 × n2Cr2) / nCr,  where r1 + r2 = r and n1 + n2 = n

Choosing a tiddlywinks team
A college Tiddlywinks Club has 17 members, 7 of whom are girls. A mixed team of 5 is chosen at random.
No. of possible outcomes = 17C5 = 17! / (5! × 12!) = 6188
No. of ways of choosing a team with exactly two girls = 7C2 × 10C3 = 7! / (2! × 5!) × 10! / (3! × 7!) = 2520
Hence the probability that the team, chosen at random, contains exactly two girls = 2520 / 6188 = 0.407 (to 3 s.f.)
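The tiddlywinks calculation can be confirmed with `math.comb` (a sketch; variable names are my own):

```python
from math import comb

team_any = comb(17, 5)                     # all possible teams of 5
team_two_girls = comb(7, 2) * comb(10, 3)  # exactly two girls, three boys

print(team_any)                             # 6188
print(team_two_girls)                       # 2520
print(round(team_two_girls / team_any, 3))  # 0.407
```

This is the (n1Cr1 × n2Cr2) / nCr pattern with n1 = 7 girls, n2 = 10 boys, r1 = 2 and r2 = 3.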

MEI Stats 1 Pages 139 to 140 Factorials and arrangements
Pages 143 to 146 Combinations; Binomial coefficients
Pages 147 to 149 Calculating probabilities in less simple cases

Page 6: s1 Revision Notes


Discrete Random Variables [Topic / Examples / References]

Discrete random variables A discrete random variable X takes values:

x1, x2, x3, x4, …, xn

with probabilities: p1, p2, p3, p4, …, pn

where pi = P(X = xi) for i = 1, 2, 3, …, n and Σ pi = Σ P(X = xi) = 1

Illustrate using a vertical line chart:

Definition by formula: Sometimes it is possible to define the probability function as a formula, as a function of r:

P(X = r) = f(r) for values of r (usually integral)

Often the function f includes a constant, k, which can be found using the property Σ pi = 1

Definition by table: For a small set of values it is often convenient to list the associated probabilities pi for each xi

xi         x1  x2  x3  ...  xn−1  xn
P(X = xi)  p1  p2  p3  ...  pn−1  pn

Calculation of probabilities: Sometimes you need to be able to calculate the probability of some compound event, given the values from the table or function.

Explanation of probabilities: Often you need to explain how the probability P(X = xk), for some value of k, is derived from first principles.

MEI Stats 1 Pages 118 to 124 Definitions; Notation; Vertical line charts; Calculation of probabilities

Expectation (mean) The expectation (or mean) of a discrete random variable is defined by:

E(X) = µ = Σ xiP(X = xi) = Σ xipi

MEI Stats 1 Pages 127 to 130 Expectation of a discrete random variable

Variance
The variance of a discrete random variable is defined by:
Var(X) = σ² = E([X − µ]²) ≡ Σx²P(X = x) − µ²
⇒ Var(X) = σ² = E(X²) − [E(X)]²

Definition by formula
X is a discrete random variable given by:
P(X = r) = k / r,  for r = 1, 2, 3, 4
To find the value of k, use Σ P(X = xi) = 1:
Σ P(X = xi) = k + k/2 + k/3 + k/4 = 1
⇒ (25/12) k = 1 ⇒ k = 12/25 = 0.48

[Vertical line chart of P(X = r) against r]

Expectation and Variance:
E(X) = µ = Σ rP(X = r) = 1 × 0.48 + 2 × 0.24 + 3 × 0.16 + 4 × 0.12 = 1.92
E(X²) = Σ r²P(X = r) = 1² × 0.48 + 2² × 0.24 + 3² × 0.16 + 4² × 0.12 = 4.8
⇒ Var(X) = E(X²) − [E(X)]² = 4.8 − 1.92² = 1.1136

Definition by table
In a competition, you have to match 4 inventors with 4 inventions. Assume this is done at random. Let X represent the number of correct matchings. The distribution is given by the table:

r         0    1    2    3    4
P(X = r)  3/8  1/3  1/4  0    1/24

Expectation and Variance:
E(X) = µ = Σ rP(X = r) = 0 × 3/8 + 1 × 1/3 + 2 × 1/4 + 3 × 0 + 4 × 1/24 = 1
E(X²) = Σ r²P(X = r) = 0² × 3/8 + 1² × 1/3 + 2² × 1/4 + 3² × 0 + 4² × 1/24 = 2
⇒ Var(X) = E(X²) − [E(X)]² = 2 − 1² = 1

Calculation of probabilities:
If two friends both enter the competition, the probability that both guess the same number of correct matchings
= (3/8)² + (1/3)² + (1/4)² + 0² + (1/24)² = 91/288 ≈ 0.316 (3 s.f.)

Explanation of probabilities:
Explanation of why P(X = 2) = 1/4:
• Total number of possible matchings = 4! = 24
• The two correct matchings can be chosen in 4C2 = 6 ways; the remaining two must then both be wrong, which can happen in only one way (by interchanging them)
⇒ P(X = 2) = 6/24 = 1/4
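Expectation and variance for both worked distributions can be checked with exact arithmetic. A sketch; the helper names are my own:

```python
from fractions import Fraction

def expectation(dist):
    """E(X) = Σ r P(X = r) for a distribution given as {r: p}."""
    return sum(r * p for r, p in dist.items())

def variance(dist):
    """Var(X) = E(X²) − [E(X)]²."""
    e_x2 = sum(r * r * p for r, p in dist.items())
    return e_x2 - expectation(dist) ** 2

# Definition by formula: P(X = r) = k/r for r = 1..4, with k = 12/25
k = Fraction(12, 25)
by_formula = {r: k / r for r in range(1, 5)}
assert sum(by_formula.values()) == 1           # probabilities sum to 1

print(expectation(by_formula))                 # 48/25 = 1.92
print(variance(by_formula))                    # 696/625 = 1.1136

# Definition by table: correct matchings of 4 inventors with 4 inventions
matchings = {0: Fraction(3, 8), 1: Fraction(1, 3), 2: Fraction(1, 4),
             3: Fraction(0), 4: Fraction(1, 24)}
print(expectation(matchings), variance(matchings))  # 1 1

# P(two entrants guess the same number of correct matchings) = Σ p²
print(sum(p * p for p in matchings.values()))  # 91/288
```

Working in `Fraction` avoids any rounding, so the results match the hand calculations exactly.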

MEI Stats 1 Pages 127 to 130 Variance of a discrete random variable

Page 7: s1 Revision Notes


Binomial Distribution and Hypothesis Testing [Topic / Examples / References]

Binomial Distribution B(n, p)
A trial is defined to have outcomes 'success' or 'failure', where P('success') = p and P('failure') = q [= 1 − p]. A random sample consists of n independent trials. Let X represent the number of 'successes' in the random sample. Then X ~ B(n, p) and
P(X = r) = nCr p^r q^(n − r),  for r = 0, 1, 2, ..., n
Mean (expected) number of 'successes' = np
If m random samples of n independent trials are taken, then the expected frequency of r successes is given by m × P(X = r).

Left and right-handed people
In a national survey, 12% of people are left-handed. In a random sample of 15 people, let X represent the number of left-handed people.
Then X ~ B(15, 0.12) and P(X = r) = 15Cr × 0.12^r × 0.88^(15 − r), for r = 0, 1, ..., 15
P(X = 3) = 15C3 × 0.12^3 × 0.88^12 = 0.170 (3 s.f.)
P(X ≥ 1) = 1 − P(X = 0) = 1 − 0.88^15 = 0.853 (3 s.f.)
Mean (expected) no. left-handed = 15 × 0.12 = 1.8
If 50 random samples of 15 people are taken, then the expected frequency of finding 3 left-handed people = 50 × P(X = 3) = 8.48 (to 3 s.f.)
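The binomial probability formula translates directly into a few lines of Python (a sketch; the function name is my own):

```python
from math import comb

def binomial_pmf(n, p, r):
    """P(X = r) for X ~ B(n, p): nCr p^r q^(n − r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# X ~ B(15, 0.12): number of left-handed people in a sample of 15
print(round(binomial_pmf(15, 0.12, 3), 3))      # 0.170
print(round(1 - binomial_pmf(15, 0.12, 0), 3))  # P(X >= 1) = 0.853
print(15 * 0.12)                                # expected number = 1.8
```

The P(X ≥ 1) line uses the complement, 1 − P(X = 0), rather than summing fifteen terms.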

MEI Stats 1 Pages 153 to 156 The binomial distribution
Pages 158 to 161 Expectation of B(n, p); Using the binomial distribution

Cumulative Binomial Probability Tables Binomial probabilities can be calculated using the cumulative binomial probability tables on pages 34 to 39 of the Students' Handbook. These tables give P(X ≤ x) for n = 1 to 20 and various values of p; the worked example below uses values from the n = 20 table.

Throwing a fair die
A fair die is thrown 20 times. Let X represent the number of sixes obtained. Then X ~ B(20, 1/6) and the cumulative binomial probability tables may be used:
P(X ≤ 5) = 0.8982
P(X = 4) = P(X ≤ 4) − P(X ≤ 3) = 0.7687 − 0.5665 = 0.2022
P(X > 6) = 1 − P(X ≤ 6) = 1 − 0.9629 = 0.0371
P(3 ≤ X ≤ 6) = P(X ≤ 6) − P(X ≤ 2) = 0.9629 − 0.3287 = 0.6342
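The tabulated values can be reproduced by summing the binomial probability formula (a sketch; the function name is my own):

```python
from math import comb

def binomial_cdf(n, p, x):
    """P(X <= x) for X ~ B(n, p), as tabulated in the cumulative tables."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(x + 1))

n, p = 20, 1 / 6  # number of sixes in 20 throws of a fair die
print(round(binomial_cdf(n, p, 5), 4))                          # 0.8982
print(round(binomial_cdf(n, p, 4) - binomial_cdf(n, p, 3), 4))  # 0.2022
print(round(1 - binomial_cdf(n, p, 6), 4))                      # 0.0371
print(round(binomial_cdf(n, p, 6) - binomial_cdf(n, p, 2), 4))  # 0.6342
```

Each line mirrors one of the table manipulations in the worked example: a direct lookup, a difference of two cumulative values, and a complement.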

MEI Stats 1 Pages 174 to 175 Cumulative binomial probability tables

Hypothesis Testing
A null hypothesis (H0) is tested against an alternative hypothesis (H1) at a particular significance level. According to given criteria, the null hypothesis is either rejected or not rejected. A hypothesis test can be either 1-tailed or 2-tailed.

Hypothesis testing procedure
(1) Establish null and alternative hypotheses:
H0: p = ...;  H1: p < ... or p > ... (1-tail);  p ≠ ... (2-tail)
(2) Decide on the significance level: s%
(3) Collect data (independent and at random): obtain r successes out of n trials.
(4) Conduct test:
1-tail, H1: p < ...: compare P(X ≤ r) with s%
1-tail, H1: p > ...: compare P(X ≥ r) with s%
2-tail, H1: p ≠ ...:
if r < mean (np), compare P(X ≤ r) with ½s%
if r > mean (np), compare P(X ≥ r) with ½s%
(5) Interpret the result in terms of the original claim:
1-tail: if P(X ≤ r) or P(X ≥ r) < s%, reject H0
2-tail: if P(X ≤ r) or P(X ≥ r) < ½s%, reject H0

Critical value and critical region The critical value is the least extreme value for which the null hypothesis (H0) is rejected. The critical region is the set of all values for which H0 is rejected.

One tail test
Chris thinks that his die is biased against producing sixes. In 20 throws of the die he gets just 1 six. Hypothesis test of Chris's claim at the 5% level:
(1) H0: p = 1/6; H1: p < 1/6 (1-tail)
(2) Significance level: 5%
(3) Data collected: 1 six in 20 trials
(4) Conduct test: P(X ≤ 1) = 0.1304 > 0.05 (5%)
(5) Interpret result: since P(X ≤ 1) > 5%, there is not enough evidence to reject H0, i.e. accept that Chris's die is not biased against sixes.
Critical value and critical region: since P(X ≤ 0) = 0.0261 < 0.05 (5%) but P(X ≤ 1) = 0.1304 > 0.05, X = 0 is the critical value and {0} is the critical region.

Two tail test
A survey claims: "15% of the population are left-handed". Hypothesis test of the survey's claim at the 10% level:
(1) H0: p = 0.15; H1: p ≠ 0.15 (2-tail)
(2) Significance level: 10% (so compare each tail with 5%)
(3) Data collected: 7 left-handed people in a random sample of 20
(4) Conduct test: since 7 > mean (20 × 0.15 = 3), P(X ≥ 7) = 1 − P(X ≤ 6) = 1 − 0.9781 = 0.0219
(5) Interpret result: since P(X ≥ 7) < 5%, there is enough evidence to reject H0, i.e. do not accept that 15% of the population are left-handed.
Critical region: since P(X ≤ 0) = 0.0388 and P(X ≤ 1) = 0.1756, and P(X ≥ 6) = 0.0673 and P(X ≥ 7) = 0.0219, {x: x = 0 or x ≥ 7} is the critical region.
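The one-tail test above can be carried out in code (a sketch; the function and variable names are my own):

```python
from math import comb

def binomial_cdf(n, p, x):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(x + 1))

# One-tail test: H0: p = 1/6, H1: p < 1/6, at the 5% level.
# Data: 1 six in 20 throws of Chris's die.
n, p0 = 20, 1 / 6
p_value = binomial_cdf(n, p0, 1)  # P(X <= 1) under H0
print(round(p_value, 4))          # 0.1304 > 0.05, so do not reject H0

# Lower-tail critical region: all x with P(X <= x) < 5%
critical = [x for x in range(n + 1) if binomial_cdf(n, p0, x) < 0.05]
print(critical)                   # [0], so the critical value is 0
```

The same `binomial_cdf` helper, with p0 = 0.15 and the upper tail 1 − P(X ≤ x − 1), reproduces the two-tail example.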

MEI Stats 1 Pages 169 to 173 Defining terms; Hypothesis testing checklist; Choosing the significance level
Pages 177 to 179 Critical values and critical regions
Pages 182 to 184 1-tail and 2-tail tests; Asymmetrical cases

Excel Spreadsheet: Binomial Distribution, Hypothesis Testing and Critical Regions