Scientific Methods 1

Barry & Goran

‘Scientific evaluation, experimental design & statistical methods’

COMP80131

Lecture 4: Statistical Methods - Probability

www.cs.man.ac.uk/~barry/mydocs/myCOMP80131

16 Nov 2011 COMP80131-SEEDSM2-4


Probability

There are two useful definitions of probability:

1. Bayesian probability is a person’s belief in the truth of a statement S, quantified on a scale from 0 (definitely not true) to 1 (definitely true).

2. Experimental (or frequentist) probability is determined by the number, M, of times that a statement S is found to be true when it is tested a large number, N, of times. The probability, P(S), may then be defined as the limit of M/N as more and more experiments are carried out & N tends to infinity.


Different language

• By either definition, probability P(S) is a number in range 0 to 1.

• Multiply by 100 to express as a percentage.

• Or express as odds:

e.g. ‘4 to 1 against’ means 1/5 = 0.2 = 20%.

• What do odds of ‘4 to 1 on’ mean?

• What does ‘50-50’ mean?
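The conversion from odds to probability can be sketched in MATLAB as follows. This is a minimal illustration, assuming the usual betting convention that ‘a to b against’ means b successes for every a failures and that ‘a to b on’ is the reverse; the helper names are hypothetical and not part of the lecture.

```matlab
% Convert betting odds to a probability (helper names are hypothetical).
oddsAgainstToProb = @(a, b) b / (a + b);   % 'a to b against'
oddsOnToProb      = @(a, b) a / (a + b);   % 'a to b on'

p_against = oddsAgainstToProb(4, 1);   % '4 to 1 against'
p_on      = oddsOnToProb(4, 1);        % '4 to 1 on'
p_even    = oddsAgainstToProb(1, 1);   % '50-50' treated as 1 to 1

fprintf('4 to 1 against: %.2f\n', p_against);
fprintf('4 to 1 on:      %.2f\n', p_on);
fprintf('50-50:          %.2f\n', p_even);
```

Multiplying any of these results by 100 gives the percentage form.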

Calculating probability

• The 2 definitions of probability usually give the same value.

• By examining a coin, we could give ourselves good reason for believing that tossing it just once will give an even chance of getting heads, i.e. that the Bayesian definition of P(S) = 0.5 where S = ‘get heads’.

• If the coin is then tossed N = 100 times we would expect about M = 50 occurrences of heads, meaning that M/N ≈ 0.5.

• Increasing N to 1000 and then to 1000000 would be expected to produce closer & closer approximations to P(S) = 0.5.

• If this does not happen, our ‘a-priori’ belief may be wrong.

• The coin may be ‘weighted’ after all.

Random process

• Tossing a coin is a random process.

• It generates a ‘random variable’ Heads or Tails.

• It is random because the outcome cannot be predicted exactly.

• If 1= heads and 0 = tails we have a random binary number.

• Throwing a dice generates a random integer in range 1-6.

• Spinning a Roulette wheel generates a random no. in range 0-36.

• Setting & marking an exam produces random nos in range 0-100

• These are all random processes producing discrete variables.

• Some random processes produce continuous variables.

e.g. measuring people’s heights.

Simulating random process

• MATLAB has functions that generate pseudo-random numbers.

• ‘rand’ produces a pseudo-random number ‘uniformly distributed’ in the range 0 to 1.

• May be considered ‘continuous’ since floating-point resolution is very fine.

• Calling ‘rand’ repeatedly produces numbers evenly distributed across the range 0 to 1.

• They are ‘pseudo-random’ because if we know the algorithm used (and its starting state), we can predict the numbers.

• So we pretend we do not know the algorithm.

• ‘rand’ may be considered to simulate some random process that generates truly random numbers, uniformly distributed.
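The ‘pseudo-random’ point can be illustrated with a short sketch. It assumes a MATLAB release that provides the ‘rng’ seeding function; the seed value 42 is arbitrary and chosen only for this example.

```matlab
% Re-seeding the generator reproduces the same 'random' numbers.
rng(42);                 % arbitrary seed, an assumption of this sketch
first = rand(1, 5);
rng(42);                 % same seed again
second = rand(1, 5);
sameSequence = isequal(first, second);   % the sequence was predictable

% A large sample should lie between 0 and 1 with a mean near 0.5.
R = rand(10000, 1);
inRange = all(R > 0 & R < 1);
sampleMean = mean(R);
fprintf('Same sequence: %d, in range: %d, mean: %.4f\n', ...
        sameSequence, inRange, sampleMean);
```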

Simulating coin tossing in MATLAB

for n=1:20
    R = rand;
    if R > 0.5, Heads(n) = 1; else Heads(n) = 0; end;
end; % of n loop
Heads

10110001110101011101 - 12 heads & 8 tails

When I changed 20 to 10,000, I got 5066 heads: P(Heads) ≈ 0.5066

When I ran it again, I got 4918 heads: P(Heads) ≈ 0.4918

Using an unfair coin

for n=1:20
    R = rand;
    if R > 0.4, Heads(n) = 1; else Heads(n) = 0; end;
end; % of n loop
Heads

00101001110101010101 - 10 heads & 10 tails

• When I changed 20 to 10,000, I got 6012 heads: P(Heads) ≈ 0.6012

• When I ran it again, I got 5979 heads: P(Heads) ≈ 0.5979
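The two loops above differ only in the threshold, so both can be written as one vectorised sketch. The variable names N and pHeads are introduced here for illustration; comparing ‘rand’ against pHeads gives heads with probability pHeads, which is equivalent in distribution to the ‘R > 0.4’ test when pHeads = 0.6.

```matlab
N = 10000;        % number of tosses
pHeads = 0.6;     % 0.5 for a fair coin, 0.6 for the unfair coin

Heads = rand(N, 1) < pHeads;    % logical vector: 1 = heads, 0 = tails
M = sum(Heads);                 % number of heads observed
estimate = M / N;               % M/N, the estimate of P(Heads)

fprintf('%d heads out of %d: P(Heads) is about %.4f\n', M, N, estimate);
```

Each run gives a slightly different estimate, as in the results quoted above.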

Estimating probability experimentally

• We cannot measure probability with 100% accuracy.

• All measurements are estimates that may be slightly or totally wrong.

• According to the experimental definition, we have to perform an experiment an infinite number of times to measure a probability.

• This is clearly impossible.

• In practice, we have to perform the experiment a finite number of times.

• (Cannot spend all our lives tossing coins)

• Accept resulting measurement as an estimate of true probability.
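How the estimate behaves as the number of experiments grows can be seen with a small simulation. This is a sketch only: the list of sample sizes is an arbitrary choice, and pTrue stands for the probability that would be unknown in a real experiment.

```matlab
pTrue = 0.5;                         % 'true' probability, known only because we simulate
Ns = [100 1000 10000 1000000];       % numbers of tosses to try
estimates = zeros(size(Ns));
for k = 1:numel(Ns)
    estimates(k) = mean(rand(Ns(k), 1) < pTrue);   % M/N for this N
    fprintf('N = %7d: M/N = %.4f\n', Ns(k), estimates(k));
end
```

The estimates for small N can be noticeably off, while those for large N are expected to sit close to pTrue.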

Bayesian Definition

• According to the Bayesian definition of probability, a person’s belief in the truth of a statement may be affected by one or more assumptions (hypotheses).

• “I assume it is a fair coin”

• Different people may have different beliefs.

• Can only estimate probability using information we have at hand, though we can modify this estimate later if we get new information.

Conditional probability

• P(S|S1) means the probability of ‘statement S’ being true given that we know that another statement, S1, is definitely true.

• If S stands for ‘get heads’ we may at first believe that P(S) = 0.5.

• But what if someone tells us that the statement S1: ‘coin is weighted with heavier metal on one side’, is true?

• We may change our estimate of probability to P(S|S1).

• P(S) is then referred to as the ‘prior’ probability.

• P(S|S1) is the ‘conditional’ or ‘posterior’ probability.


Bayes Theorem

• P(A) is ‘prior’ as it does not take into account any information about B.

• Similarly P(B) is ‘prior’.

• P(A|B) and P(B|A) are conditional or ‘posterior’ probabilities.

• Let A = ‘coin is fair’ & B = ‘getting 12 heads out of 20’

• P(A|B) = P(B|A) P(A) / P(B)

• Expresses the probability of some fact ‘A’ being true when we know that some other fact ‘B’ is true:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

What is prob of getting 12 heads out of 20?

clear all; % WITH FAIR COIN
HIS = zeros(21,1);
for rep=1:1000
    for n=1:20
        R = rand; % Unif random number between 0 & 1
        if R > 0.5, Heads(n)=0; else Heads(n)=1; end;
    end; % of n loop
    Count = sum(Heads);
    HIS(1+Count) = HIS(1+Count)+1;
end; % of rep loop
figure(1); stem(0:20,HIS);

Histogram for 1000 trials

[Figure: stem plot, FAIR COIN. x-axis: Number of Heads obtainable with 20 coin-tosses (0 to 20); y-axis: Frequency out of 1000 trials (0 to 200).]

Estimate of probability distribution

[Figure: stem plot, FAIR COIN. x-axis: Number of Heads obtainable with 20 coin-tosses (0 to 20); y-axis: Estimate of prob distribution based on 1000 trials (0 to 0.2).]


Probability estimate (fair coin)

Estimated probabilities:

for 0:9 heads

0 0 0 0 0.008 0.011 0.024 0.087 0.119 0.160

for 10:19 heads

0.194 0.157 0.115 0.076 0.03 0.012 0.003 0.003 0.001 0

for 20 heads

0

So our estimate of the probability of getting 12 heads out of 20 with a fair coin is 0.115.
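The step from histogram to estimated probabilities is a division by the number of trials. A vectorised sketch of the whole experiment is given below; the names nTrials and nTosses are introduced here, and ‘accumarray’ is used in place of the explicit counting loop.

```matlab
nTrials = 1000;    % repetitions of the 20-toss experiment
nTosses = 20;

Heads = rand(nTosses, nTrials) < 0.5;    % each column is one trial, fair coin
Count = sum(Heads, 1);                   % number of heads in each trial
HIS = accumarray(Count(:) + 1, 1, [nTosses + 1, 1]);   % HIS(1+k) = trials with k heads
P = HIS / nTrials;                       % estimated probability distribution

fprintf('Estimated P(12 heads out of 20) = %.3f\n', P(1 + 12));
```

Because the estimate depends on the random numbers drawn, repeated runs give values scattered around the figure quoted above.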

What is prob of getting 12 heads out of 20?

clear all; % WITH 60-40 WEIGHTED COIN
HIS = zeros(21,1);
for rep=1:1000
    for n=1:20
        R = rand; % Unif random number between 0 & 1
        if R > 0.4, Heads(n)=1; else Heads(n)=0; end;
    end; % of n loop
    Count = sum(Heads);
    HIS(1+Count) = HIS(1+Count)+1;
end; % of rep loop
figure(1); stem(0:20,HIS);

HISTOGRAM for ‘60-40’ weighted coin

[Figure: stem plot. x-axis: Number of Heads obtainable with 20 coin-tosses (0 to 20); y-axis: Frequency out of 1000 trials (0 to 200).]

Prob distribution estimate for ‘60-40’ weighted coin

[Figure: stem plot. x-axis: Number of Heads obtainable with 20 coin-tosses (0 to 20); y-axis: Estimate of prob distribution based on 1000 trials (0 to 0.2).]

Estimate Cumulative Prob Distrib

CDF(1) = HIS(1)/1000;
for n=2:21
    CDF(n) = CDF(n-1) + HIS(n)/1000;
end;
figure(3); stem(0:20,CDF);

Easily derived from a Histogram or Prob Distribution.

Estimates the prob of getting between 0 and n Heads (stored in CDF(1+n)).
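The running sum in the loop can also be written with MATLAB's ‘cumsum’. The sketch below regenerates a fair-coin histogram so that it is self-contained; the histogram-building lines are an assumption of this example rather than the lecture's code.

```matlab
nTrials = 1000;
Count = sum(rand(20, nTrials) < 0.5, 1);       % heads per 20-toss trial
HIS = accumarray(Count(:) + 1, 1, [21, 1]);    % histogram of 0..20 heads

CDF = cumsum(HIS) / nTrials;    % CDF(1+n) estimates P(between 0 and n heads)

fprintf('Estimated P(10 or fewer heads) = %.3f\n', CDF(1 + 10));
```

The last entry is always 1, since every trial gives between 0 and 20 heads.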

Estimate of Cumulative Prob Dist

[Figure: stem plot, FAIR COIN. x-axis: Number of Heads obtainable with 20 coin-tosses (0 to 20); y-axis: Estimate of cumulative prob dist based on 1000 trials (0 to 1).]

Usually an S-shaped function.

4 coin-tosses: how many possible outcomes?

There are 2^4 = 16 possible outcomes (1 = heads, 0 = tails):

0000 0001 0010 0011 0100 0101 0110 0111
1000 1001 1010 1011 1100 1101 1110 1111

How many with 0 heads? 1

How many with 1 head? 4 = 4C1

How many with 2 heads? 6 = 4C2 = 4×3 / (2!)

How many with 3 heads? 4 = 4C3

How many with 4 heads? 1

Combinations:

nCr = no. of ways of choosing r from n

= n(n-1) … (n-r+1) / (r!)
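The counts can be checked by listing every outcome and counting heads. This is a small sketch using ‘dec2bin’ to enumerate the 16 outcomes and ‘nchoosek’ for nCr; the variable names are illustrative only.

```matlab
n = 4;
outcomes = dec2bin(0:2^n - 1) - '0';       % 16-by-4 matrix of 0s and 1s
headCount = sum(outcomes, 2);              % number of heads in each outcome
counts = accumarray(headCount + 1, 1)';    % outcomes with 0, 1, ..., 4 heads

formula = arrayfun(@(r) nchoosek(n, r), 0:n);   % nCr for r = 0..4

disp(counts);    % counted by enumeration
disp(formula);   % from the combinations formula
```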

Binomial Prob Distribution

• Distributions have up to now been estimated.

• For random processes with just 2 possible outcomes, we can derive a true distribution:

• If p = prob(Heads), the prob of getting Heads exactly r times in n independent coin-tosses is:

$${}^{n}C_{r}\; p^{r}\,(1-p)^{n-r}$$

• For a fair coin, p = 0.5, and this becomes nCr / 2^n.

• For a fair dice, the prob of throwing 3 sixes in five throws is:

$$\frac{5 \times 4 \times 3}{3 \times 2 \times 1} \left(\frac{1}{6}\right)^{3} \left(\frac{5}{6}\right)^{2} \approx 0.032$$

Implementing formula (fair coin)

p = 0.5; % for fair coin tossing
n = 20;
for r=0:n
    nCr = prod(n:-1:(n-r+1))/prod(1:r);
    P(1+r) = nCr * (p^r) * (1-p)^(n-r);
end;
figure(4); stem(0:20,P);
axis([0 20 0 0.2]); grid on;

True prob distribution (n=20)

[Figure: stem plot, fair coin. x-axis: No of heads obtainable with n coin-tosses (0 to 20); y-axis: True probability of getting that no of heads (0 to 0.2).]

True probability from formula

For 0-9 heads:
0 0 0.0002 0.0011 0.0046 0.015 0.037 0.074 0.12 0.16

For 10-19 heads:
0.176 0.16 0.12 0.074 0.037 0.015 0.0046 0.0011 0.0002 0

For 20 heads:
0

True prob of getting 12 heads with a fair coin is 0.12.

Changing p to 0.6 (the probability of heads used for the ‘60-40’ weighted coin in the simulation), we find that the true probability of getting 12 heads out of 20 with that coin is 0.18.
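Both quoted values can be reproduced directly from the binomial formula. In this sketch p denotes the probability of heads, so the weighted coin that lands heads 60% of the time uses p = 0.6; the helper name ‘binom’ is hypothetical.

```matlab
% Binomial probability of exactly r heads in n tosses (hypothetical helper).
binom = @(n, r, p) nchoosek(n, r) * p^r * (1 - p)^(n - r);

P_fair     = binom(20, 12, 0.5);   % fair coin
P_weighted = binom(20, 12, 0.6);   % coin with P(Heads) = 0.6

fprintf('Fair coin:     %.4f\n', P_fair);
fprintf('Weighted coin: %.4f\n', P_weighted);
```

Rounded to two decimal places these are 0.12 and 0.18.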

Back to Bayes Theorem

• There are 2 coins: a fair one & a ‘60-40’ weighted one.

• We choose a coin at random & toss it 20 times.

• What is the probability of having the weighted coin when we get 12 heads out of 20?

• A = ‘coin is weighted 60-40’ & B = ‘get 12 heads out of 20’

• We know that P(B|fair coin) is 0.12 & P(B|A) is 0.18.

• Each coin is equally likely to be chosen, so P(A) = 0.5 and P(B) will be the average of 0.12 & 0.18 = 0.15

• P(A|B) = P(B|A) P(A) / P(B)

• = 0.18 × 0.5 / 0.15 = 0.6
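The same calculation, written out as a sketch in MATLAB. The variable names are introduced here; P(B) is formed as a prior-weighted sum, which reduces to the plain average because both coins have prior 0.5.

```matlab
pB_given_fair     = 0.12;   % P(B | fair coin), rounded value from the formula
pB_given_weighted = 0.18;   % P(B | A), rounded value from the formula
prior_fair     = 0.5;       % each coin equally likely to be picked
prior_weighted = 0.5;       % P(A)

% Total probability of B over both coins
pB = pB_given_fair * prior_fair + pB_given_weighted * prior_weighted;

% Bayes' theorem: P(A | B)
posterior_weighted = pB_given_weighted * prior_weighted / pB;

fprintf('P(B) = %.2f, P(A|B) = %.2f\n', pB, posterior_weighted);
```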

Further illustration of Bayes Theorem

• At a college there are:

10 students from France: 5 girls & 5 boys
15 from UK: 5 girls & 10 boys
20 from Canada: 5 girls & 15 boys

Calculation

• If we choose a student at random, the a-priori probability that this student is French is P(French) = 10/45 = 2/9 ≈ 0.22

• Now if we notice that this student is a boy, how does this change the probability that the student is French?

• Use Bayes’ Theorem as follows:

$$P(\text{French} \mid \text{Boy}) = \frac{P(\text{Boy} \mid \text{French})\,P(\text{French})}{P(\text{Boy})}$$

= 0.5 × (10/45) / (30/45) = 1/6 ≈ 0.167

• The fact that we notice that the chosen student is a boy gives us additional information that changes the probability that the student chosen at random will be French.

Check the calculation

• We can check the previous result by common sense, noticing that out of the 30 boys in the college, 5 are from France. Therefore, P(French|Boy) = 5/30 = 1/6.
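Both routes to the answer can be compared numerically. The sketch below stores the college data as a small matrix (a layout chosen for this example) and computes the probability once through Bayes’ theorem and once by direct counting.

```matlab
% Rows: France, UK, Canada. Columns: girls, boys.
counts = [5 5; 5 10; 5 15];

total = sum(counts(:));                               % 45 students
pFrench = sum(counts(1, :)) / total;                  % 10/45
pBoy = sum(counts(:, 2)) / total;                     % 30/45
pBoyGivenFrench = counts(1, 2) / sum(counts(1, :));   % 5/10

pFrenchGivenBoy = pBoyGivenFrench * pFrench / pBoy;   % via Bayes' theorem
direct = counts(1, 2) / sum(counts(:, 2));            % 5 French boys out of 30 boys

fprintf('Bayes: %.4f, direct count: %.4f\n', pFrenchGivenBoy, direct);
```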

Usefulness of Bayes Theorem

• In general Bayes’ theorem allows us to take additional information into account when calculating probabilities. Without the additional information, we have a ‘prior’ probability and with it we have a ‘conditional’ or ‘posterior’ probability.

Bayes Theorem in medicine

• A patient goes to a doctor with a bad cough & a fever. The doctor needs to decide whether he has ‘swine flu’.

• Let statement S = ‘has bad cough and fever’ and statement F = ‘has swine flu’.

• The doctor consults his medical books and finds that about 40% of patients with swine-flu have these same symptoms.

• Assuming that, currently, about 1% of the population is suffering from swine-flu and that currently about 5% have bad cough and fever (due to many possible causes including swine-flu), we can apply Bayes theorem to estimate the probability of this particular patient having swine-flu.
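One way to work this example through is sketched below, using the three figures given above and Bayes’ theorem in the form P(F|S) = P(S|F) P(F) / P(S). The variable names are chosen for this illustration.

```matlab
pS_given_F = 0.40;   % 40% of swine-flu patients have bad cough and fever
pF = 0.01;           % 1% of the population has swine flu
pS = 0.05;           % 5% of the population has bad cough and fever

pF_given_S = pS_given_F * pF / pS;   % Bayes' theorem

fprintf('P(swine flu | bad cough and fever) = %.2f\n', pF_given_S);
```

Under these assumed figures the symptoms raise the probability of swine flu from 1% to about 8%.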

Another problem to solve

• A doctor in another country knows from his text-books that for 40% of patients with swine-flu, the statement S, ‘has bad cough and fever’ is true. He sees many patients and comes to believe that the probability that a patient with ‘bad cough and fever’ actually has swine-flu is about 0.1 or 10%. If there were reason to believe that, currently, about 1% of the population have a bad cough and fever, what percentage of the population is likely to be suffering from swine-flu?
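A possible line of attack, offered as a sketch rather than the lecturers’ own solution: with F = ‘has swine flu’, Bayes’ theorem can be rearranged to make the prior P(F) the unknown, and the three given figures substituted.

$$P(F) = \frac{P(F \mid S)\,P(S)}{P(S \mid F)} = \frac{0.1 \times 0.01}{0.4} = 0.0025$$

On these figures about 0.25% of the population would be suffering from swine-flu.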

Some questions from Lecture 2

• Analyse the fictitious exam results & comment on features.

• Compute means, stds & vars for each subject & histograms for the distributions.

• Make observations about performance in each subject & overall.

• Do marks support the hypothesis that people good at Music are also good at Maths?

• Do they support the hypothesis that people good at English are also good at French?

• Do they support the hypothesis that people good at Art are also good at Maths?

• If you have access to only 50 rows of this data, investigate the same hypotheses.

• What conclusions could you draw, and with what degree of certainty?

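Continuous random processes

• Characterised by probability density functions (pdf)

Uniform pdf: Prob of the random variable x lying between a and b (for 0 ≤ a ≤ b ≤ 1) is:

$$\int_{a}^{b} \mathrm{pdf}(x)\,dx = b - a$$

[Figure: uniform pdf(x) of height 1 between x = 0 and x = 1, with a and b marked on the x-axis.]

Gaussian (Normal) pdf with mean m & std dev σ:

$$\mathrm{pdf}(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{(x-m)^{2}}{2\sigma^{2}}}$$

$$\mathrm{Prob} = \int_{a}^{b} \mathrm{pdf}(x)\,dx$$

[Figure: Gaussian pdf(x) centred on m, with a, b, m-σ and m+σ marked on the x-axis.]

68% for m ± σ
95.5% for m ± 2σ
99.7% for m ± 3σ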
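The three Gaussian percentages can be checked numerically. The sketch relies on the standard identity that, for a Gaussian, the probability of lying within k standard deviations of the mean equals erf(k/√2); ‘erf’ is a built-in MATLAB function.

```matlab
k = [1 2 3];                        % number of standard deviations
probWithin = erf(k / sqrt(2));      % P(m - k*sigma < x < m + k*sigma)

for i = 1:numel(k)
    fprintf('Within m +/- %d sigma: %.1f%%\n', k(i), 100 * probWithin(i));
end
```

The result does not depend on the particular values of m and σ.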

pdf & Histograms

Ru = rand(10000,1); % 10000 unif samples
hist(Ru,20);
Rg = randn(10000,1); % Gaussian with m=0, std=1
hist(Rg,20);

[Figure: histogram of Ru with 20 bins. x-axis: 0 to 1; y-axis: counts, 0 to 600.]

[Figure: histogram of Rg with 20 bins. x-axis: -4 to 5; y-axis: counts, 0 to 1600.]

Converting histogram to estimate of pdf

• Divide each column by number of samples

• Then divide by the bin width. For data spanning a range of width 1, such as the output of ‘rand’, this is the same as multiplying by the number of bins.

• For better approximation, increase number of bins
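The conversion can be sketched for the Gaussian samples as follows. The general scaling divides by the number of samples and by the bin width, which is computed here from the bin centres returned by ‘hist’; the comparison with the true pdf is an addition for illustration.

```matlab
N = 10000;
nBins = 20;
Rg = randn(N, 1);                          % Gaussian samples, m = 0, std = 1

[counts, centres] = hist(Rg, nBins);       % bin counts and bin centres
binWidth = centres(2) - centres(1);        % bins are equally spaced
pdfEst = counts / (N * binWidth);          % histogram scaled to a pdf estimate

area = sum(pdfEst) * binWidth;             % a pdf estimate should integrate to 1
truePdf = exp(-centres.^2 / 2) / sqrt(2*pi);
fprintf('Area = %.4f, largest error = %.3f\n', area, max(abs(pdfEst - truePdf)));
```

With more samples and more bins the estimate is expected to follow the true curve more closely.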


Concept of a ‘null-hypothesis’

• A null-hypothesis is an assumption that is made and then tested by a set of experiments designed to reveal that it is likely to be false, if it is false.

• Testing is done by considering how probable the results are, assuming the null hypothesis is true.

• If the results appear very improbable the researcher may conclude that the null-hypothesis is likely to be false.

• This is usually the outcome the researcher hopes for when he or she is trying to prove that a new technique is likely to have some value.


An example

• Assume we wish to find out if a proposed technique designed to benefit users of a system is likely to have any value.

• Divide the users into two groups and offer the proposed technique to one group and something different to the other group.

• The null-hypothesis would be that the proposed technique offers no measurable advantage over the other techniques.

The testing

• This would be carried out by looking for differences between the sets of results obtained for each of the two groups.

• Careful experimental design will try to eliminate differences not caused by the techniques being compared.

• Must take a large number of users in each group & randomize the way the users are assigned to groups.

• Once other differences have been eliminated as far as possible, any remaining difference will hopefully be indicative of the effectiveness of the techniques being investigated.

• The vital question is whether the remaining differences are likely to be due to the advantages of the new technique, or to the inevitable random variations that arise from the other factors.

• Are the differences statistically significant?

• Can employ a statistical significance test to find out.

Failure of the experiment

• If the results are not found to look improbable under the null-hypothesis, i.e. if the differences between the two groups are not statistically significant, then no conclusion can be made.

• The null-hypothesis could be true, or it could still be false.

• It would be a mistake to conclude that the ‘null-hypothesis’ has been proved likely to be true in this circumstance.

• It is quite possible that the results of the experiment give insufficient evidence to make any conclusions at all.

P-Value

• Probability of obtaining a test result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.

• Reject the null hypothesis if the p-value is less than some value α (significance level) which is often 0.05 or 0.01.

• When the null-hypothesis is rejected, the result is said to be statistically significant.

Checking whether a coin is fair

Suppose we obtain heads 14 times out of 20 flips.

The p-value for this test result would be the probability of a fair coin landing on heads at least 14 times out of 20 flips. This is:

(20C14 + 20C15 + 20C16 + 20C17 + 20C18 + 20C19 + 20C20) / 2^20 = 0.058

This is the probability that a fair coin would give a result as extreme or more extreme than 14 heads out of 20.
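The sum can be evaluated in a few lines. This sketch uses ‘nchoosek’ for each nCr term; the variable names are illustrative, and the calculation is the one-sided tail (14 or more heads) described above.

```matlab
n = 20;           % number of flips
observed = 14;    % number of heads obtained

% Number of outcomes with 14, 15, ..., 20 heads
tail = arrayfun(@(r) nchoosek(n, r), observed:n);

pValue = sum(tail) / 2^n;    % each of the 2^20 outcomes is equally likely for a fair coin

fprintf('p-value for %d or more heads out of %d: %.4f\n', observed, n, pValue);
```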

Significance test

• Reject null hypothesis if p-value < α.

• If α = 0.05, the rejection of the null hypothesis is at the 5% (significance) level.

• The probability of wrongly rejecting the null-hypothesis (Type 1 error) will be equal to α.

• This is considered sufficiently low.

• In this case, p-value > 0.05, therefore observation is consistent with null hypothesis and we cannot reject it.

• Cannot conclude that coin is likely to be unfair.

• But we have NOT proved that coin is likely to be fair.

• 14 heads out of 20 flips can be ascribed to chance alone.

• It falls within the range of what could happen 95% of the time with a fair coin.