population genetics lab 2


Upload: tatum-sellers

Post on 30-Dec-2015


Page 1: Population Genetics Lab 2

Population Genetics Lab 2

BINOMIAL PROBABILITY &

HARDY-WEINBERG EQUILIBRIUM

Page 2: Population Genetics Lab 2

Last Week: Sample Point Method

Example: Use the Sample Point Method to find the probability of getting exactly two heads in three tosses of a balanced coin.

1. The sample space of this experiment is:

Outcome  Toss 1  Toss 2  Toss 3  Shorthand  Probability
1        Head    Head    Head    HHH        1/8
2        Head    Head    Tail    HHT        1/8
3        Head    Tail    Head    HTH        1/8
4        Tail    Head    Head    THH        1/8
5        Tail    Tail    Head    TTH        1/8
6        Tail    Head    Tail    THT        1/8
7        Head    Tail    Tail    HTT        1/8
8        Tail    Tail    Tail    TTT        1/8

2. Assuming that the coin is fair, each of these 8 outcomes has a probability of 1/8.

3. The probability of getting exactly two heads is the sum of the probabilities of outcomes 2, 3, and 4 (HHT, HTH, and THH), or 1/8 + 1/8 + 1/8 = 3/8 = 0.375.
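The sample point method above can be sketched in a few lines of Python. This is only an illustration: the variable names are invented here, and exact fractions are used so the 3/8 result is not blurred by floating-point rounding.

```python
from fractions import Fraction
from itertools import product

# Enumerate every sample point for three tosses of a coin (H = head, T = tail).
sample_space = ["".join(tosses) for tosses in product("HT", repeat=3)]

# With a fair coin, every sample point is equally likely.
point_probability = Fraction(1, len(sample_space))

# Keep the sample points with exactly two heads and sum their probabilities.
favourable = [point for point in sample_space if point.count("H") == 2]
p_two_heads = point_probability * len(favourable)

print(sample_space)
print(favourable, p_two_heads, float(p_two_heads))
```

Enumeration like this is only practical for small experiments, because the number of sample points doubles with every additional toss.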

Page 3: Population Genetics Lab 2

Sample Point Method:

Example: Find the probability of getting exactly 10 heads in 30 tosses of a balanced coin.

Total # of sample points = 2^30 = 1,073,741,824

Page 4: Population Genetics Lab 2

Need a way of accounting for all the possibilities

Example: In drawing 3 M&Ms from an unlimited M&M bowl that is always 60% red and 40% green, what is P(2 green)?

$$P(2G) = P(GGR) + P(GRG) + P(RGG)$$

$$P(2G) = (0.4)(0.4)(0.6) + (0.4)(0.6)(0.4) + (0.6)(0.4)(0.4)$$

$$P(2G) = 3(0.4)(0.4)(0.6) = 3(0.4)^2(0.6)$$

If one green M&M is just as good as another…

$$P(2G) = \binom{3}{2}(0.4)^2(0.6)$$
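As a rough check of the M&M example, the sketch below enumerates the orderings that contain exactly two green M&Ms and compares the summed probability with the shortcut that counts the orderings first. The helper name `sequence_probability` is made up for this illustration.

```python
from itertools import product
from math import comb, isclose

# Bowl composition from the example: 40% green (G), 60% red (R).
P_COLOR = {"G": 0.4, "R": 0.6}

def sequence_probability(sequence):
    """Probability of one specific ordered draw, e.g. ('G', 'G', 'R')."""
    probability = 1.0
    for color in sequence:
        probability *= P_COLOR[color]
    return probability

# Long way: add up every ordering of 3 draws with exactly 2 greens.
orderings = [s for s in product("GR", repeat=3) if s.count("G") == 2]
p_enumerated = sum(sequence_probability(s) for s in orderings)

# Shortcut: (number of orderings) x (probability of any one ordering).
p_shortcut = comb(3, 2) * 0.4**2 * 0.6

print(orderings)
print(p_enumerated, p_shortcut, isclose(p_enumerated, p_shortcut))
```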

Page 5: Population Genetics Lab 2

Binomial Probability Distribution

$$P(y) = \binom{n}{y} s^{y} f^{n-y}$$

Where, n = Total # of trials.

y = Total # of successes.

s = probability of getting success in a single trial.

f = probability of getting failure in a single trial (f = 1-s).

Page 6: Population Genetics Lab 2

Assumptions of Binomial Distribution

1. The trials are independent, finite in number, and conducted under the same conditions.

2. There are only two types of outcome (e.g., success and failure).

3. Outcomes are mutually exclusive and independent.

4. Probability of getting a success in a single trial remains constant throughout all the trials.

5. Probability of getting a failure in a single trial remains constant throughout all the trials.

6. The # of successes is finite and a non-negative integer (0 to n).

Page 7: Population Genetics Lab 2

Properties of Binomial Distribution

Mean or expected # of successes in n trials, E(y) = ns

Variance of y, V(y) = nsf

Standard deviation of y, σ(y) = (nsf)^(1/2)

Page 8: Population Genetics Lab 2

Example: Find the probability of getting exactly 10 heads in 30 tosses of a balanced coin.

Solution:

We know n = 30, y = 10, s = 0.5, f = 0.5.

$$P(y = 10) = \binom{30}{10}(0.5)^{10}(0.5)^{20} \approx 0.028$$
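The binomial calculation for this example can be sketched in Python as follows. The function name `binomial_pmf` is an illustrative choice, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binomial_pmf(y, n, s):
    """Probability of exactly y successes in n independent trials,
    where s is the probability of success in a single trial."""
    f = 1.0 - s  # probability of failure in a single trial
    return comb(n, y) * s**y * f ** (n - y)

# Exactly 10 heads in 30 tosses of a balanced coin.
p_10_heads = binomial_pmf(10, 30, 0.5)
print(f"P(y = 10) = {p_10_heads:.4f}")
```

The same function handles any of the "exactly y copies" questions in this lab by changing `y`, `n`, and `s`.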

Page 9: Population Genetics Lab 2

Example: Find the expected # of heads in 30 tosses of a balanced coin. Also calculate variance.

Solution:

E(Y) = ns = 30*0.5 = 15

V(Y) = nsf = 30*0.5*0.5 = 7.5
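One way to convince yourself that E(y) = ns and V(y) = nsf is to compute the mean and variance directly from the binomial probabilities. The sketch below does this for the 30-toss example; the variable names are illustrative.

```python
from math import comb, isclose

n, s = 30, 0.5
f = 1 - s

# Binomial probability of each possible number of heads, y = 0..n.
pmf = [comb(n, y) * s**y * f ** (n - y) for y in range(n + 1)]

# Mean and variance computed from their definitions.
mean = sum(y * p for y, p in enumerate(pmf))
variance = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))

print(mean, n * s)           # both should be 15
print(variance, n * s * f)   # both should be 7.5
```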

Page 10: Population Genetics Lab 2

Problem 1 (10 minutes)(2 points)

An allozyme locus has three alleles, A1,A2, and A3 with frequencies 0.847, 0.133, and 0.020, respectively. If we sample 30 diploid individuals, what is the probability of:

• Not finding any copies of A2?

• Finding at least one copy of A2?

• GRADUATE STUDENTS ONLY: Finding fewer than 2 copies of A2?

Page 11: Population Genetics Lab 2

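Example: How many diploid individuals should be sampled to detect at least one copy of allele A2 from Problem 1 with probability of at least 0.95?

Solution: Let n be the number of alleles sampled. The frequency of A2 is 0.133, so

$$P(\text{at least one } A_2) = 1 - (1 - 0.133)^{n} \ge 0.95$$

$$n \ge \frac{\ln(0.05)}{\ln(0.867)} \approx 20.99$$

Thus, to detect at least one copy of allele A2 with probability of at least 0.95, one would need to sample at least 21 alleles (i.e., at least 11 diploid individuals).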
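The sample-size reasoning can be sketched as a small Python helper: find the smallest number of alleles n for which 1 - (1 - q)^n reaches the target probability, then convert alleles to diploid individuals. The function names are hypothetical, and the two `while` loops are only a guard against floating-point rounding at the boundary.

```python
from math import ceil, log

def min_alleles_to_detect(q, target=0.95):
    """Smallest n such that P(at least one copy) = 1 - (1 - q)**n >= target.
    Assumes 0 < q < 1 and 0 < target < 1."""
    n = max(1, ceil(log(1 - target) / log(1 - q)))
    # Adjust by hand in case rounding put us one step off.
    while 1 - (1 - q) ** n < target:
        n += 1
    while n > 1 and 1 - (1 - q) ** (n - 1) >= target:
        n -= 1
    return n

def min_diploid_individuals(q, target=0.95):
    """Each diploid individual contributes two alleles to the sample."""
    return ceil(min_alleles_to_detect(q, target) / 2)

# Allele A2 from Problem 1 has frequency 0.133.
print(min_alleles_to_detect(0.133), min_diploid_individuals(0.133))
```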

Page 12: Population Genetics Lab 2

Problem 2 (15 minutes) (2 points)

The frequency of red-green color blindness is 0.07 for men and 0.005 for women. You are designing a survey to determine the effect of color blindness on educational success. How many males and females would you have to sample to ensure that the probability of including at least one color-blind individual of each sex would be 0.90 or greater?

Page 13: Population Genetics Lab 2

Estimation of allele frequency for Co-dominant locus

$$p = \frac{2N_{11} + N_{12}}{2N}, \qquad q = \frac{2N_{22} + N_{12}}{2N}$$

Where, p = Frequency of allele A1

q = Frequency of Allele A2

N11 = # of individuals with genotype A1A1

N12 = # of individuals with genotype A1A2

N22 = # of individuals with genotype A2A2

N = total # of diploid individuals =N11+N12+N22

Page 14: Population Genetics Lab 2

Estimation of Standard Error

$$SE_p = \sqrt{\frac{p(1-p)}{2N}}, \qquad SE_q = \sqrt{\frac{q(1-q)}{2N}}$$

Where, p = Frequency of allele A1

q = Frequency of Allele A2

SEp = Standard error for frequency of allele A1

SEq = Standard error for frequency of allele A2

N = total # of diploid individuals =N11+N12+N22

Page 15: Population Genetics Lab 2

Standard Deviation v. Standard Error

$$SD = \sqrt{Var}$$ (measure of dispersion of the data)

$$SE = \sqrt{\frac{Var}{n}}$$ (measure of dispersion of the mean)

We expect ~68% of the data to fall within 1 standard deviation of the mean.

Page 16: Population Genetics Lab 2

Example: What are the allele frequencies of alleles A1 and A2, if the following genotypes have been observed in a sample of 50 diploid individuals?

Genotype  Count
A1A1      17
A1A2      23
A2A2      10

Solution: N11 = 17, N12 = 23, and N22 = 10

Page 17: Population Genetics Lab 2

$$p = \frac{2N_{11} + N_{12}}{2N} = \frac{2(17) + 23}{2(50)} = 0.57$$

q = 1 - p = 0.43
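The worked example can be reproduced with a short Python sketch. The standard-error function assumes the usual binomial sampling formula, SE = sqrt(p(1 - p) / 2N), for 2N sampled alleles; that formula and the function names are assumptions of this illustration.

```python
from math import sqrt

def allele_frequencies(n11, n12, n22):
    """Allele frequencies (p, q) at a co-dominant locus with two alleles."""
    n = n11 + n12 + n22                  # diploid individuals
    p = (2 * n11 + n12) / (2 * n)        # frequency of A1
    q = (2 * n22 + n12) / (2 * n)        # frequency of A2
    return p, q

def standard_error(frequency, n_individuals):
    """Assumed binomial standard error of an allele frequency
    estimated from 2 * n_individuals sampled alleles."""
    return sqrt(frequency * (1 - frequency) / (2 * n_individuals))

p, q = allele_frequencies(17, 23, 10)
se_p = standard_error(p, 50)
se_q = standard_error(q, 50)
print(f"p = {p:.2f} +/- {se_p:.4f}, q = {q:.2f} +/- {se_q:.4f}")
```

With only two alleles the two standard errors are identical, because p(1 - p) = q(1 - q) when q = 1 - p.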

Page 18: Population Genetics Lab 2

Problem 3 (10 minutes) (2 pts)

Estimate the allele frequencies (include their respective standard errors) for alleles A1, A2, and A3 if the following genotypes have been observed in a sample of 200 individuals:

Genotype  Count
A1A1      19
A2A2      17
A3A3      14
A1A2      52
A1A3      57
A2A3      41

$$p_i = \frac{N_{ii} + \frac{1}{2}\sum_{j=1,\, j \ne i}^{n} N_{ij}}{N}$$
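The general formula for p_i, with any number of alleles, can be sketched by counting allele copies: each homozygote contributes two copies of its allele and each heterozygote contributes one copy of each. The function name and the dictionary layout below are illustrative assumptions, and the usage lines reuse the two-allele counts (17, 23, 10) from the earlier example rather than the Problem 3 data.

```python
def multiallelic_frequencies(genotype_counts):
    """Allele frequencies from genotype counts.

    genotype_counts maps a pair of allele names, e.g. ("A1", "A2"),
    to the number of individuals observed with that genotype.
    """
    total_individuals = sum(genotype_counts.values())
    copies = {}
    for (allele_a, allele_b), count in genotype_counts.items():
        # A homozygote adds its count twice to the same allele.
        copies[allele_a] = copies.get(allele_a, 0) + count
        copies[allele_b] = copies.get(allele_b, 0) + count
    return {allele: c / (2 * total_individuals) for allele, c in copies.items()}

# Check against the two-allele example: expect p = 0.57 and q = 0.43.
example = {("A1", "A1"): 17, ("A1", "A2"): 23, ("A2", "A2"): 10}
print(multiallelic_frequencies(example))
```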

Page 19: Population Genetics Lab 2

Problem 4 (Time 10 min.)(2 pts)

Tay-Sachs disease is an autosomal recessive genetic disorder causing the death of nerve cells in the brain due to the steady accumulation of gangliosides. Extensive genotyping has determined that approximately 1 in 30 of the 5 million Ashkenazi Jews within the United States is a carrier.

a) Assuming HWE and Mendelian inheritance of the disease, what is the frequency of the recessive allele in this population?

b) What is the SE of this estimate? (Assume 1,000 people were sampled)

c) How many affected children would you expect to be born in this population?

d) What are the assumptions of these estimates?

Page 20: Population Genetics Lab 2

Hypothesis Testing

Hypothesis: Tentative statement for a scientific problem that can be tested by further investigations.

1. Null Hypothesis (H0): There is no significant difference in observed and expected values.

2. Alternate Hypothesis (H1): There is a significant difference in observed and expected values.

Example:

H0 = Fertilized and unfertilized crops have equal yields

H1 = Fertilized and unfertilized crops do not have equal yields

Page 21: Population Genetics Lab 2

Remember: In the final conclusion after the experiment, we either

"Reject H0 in favor of H1"

or

"Fail to reject H0".

Page 22: Population Genetics Lab 2

Type I error: Error due to rejection of a null hypothesis when it is actually true (false positive).

Level of significance (LOS) (α): Maximum probability allowed for committing a type I error.

At 5% LOS (α = 0.05), we accept that if we were to repeat the experiment many times with the null hypothesis actually true, we would falsely reject it 5% of the time.

Page 23: Population Genetics Lab 2

P-value:

Probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed.

If the P-value is smaller than a particular value of α, then the result is significant at that level of significance.

Page 24: Population Genetics Lab 2

Testing departure from HWE

In a randomly mating population, allele and genotype frequencies remain constant from generation to generation.

H0 = There is no significant difference between observed and expected genotype frequencies (i.e., population is in HWE)

H1 = There is a significant difference between observed and expected genotype frequencies (i.e., population is not in HWE)

Page 25: Population Genetics Lab 2

HWE Assumptions

1. Random mating
2. No selection
   a. Equal numbers of offspring per parent
   b. All progeny equally fit
3. No mutation
4. Single, very large population
5. No migration

Page 26: Population Genetics Lab 2

χ2 test

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Where,

O_i = Observed count of genotype i

E_i = Expected count of genotype i

k = Number of genotypes

df = k - # estimated parameters - 1

Reject H0 if $\chi^2 > \chi^2_{\alpha,\, df}$

Page 27: Population Genetics Lab 2

Example: A population of Mountain Laurel at Cooper’s Rock State Forest has the following observed genotype counts:

Genotype  Observed number
A1A1      5000
A1A2      3000
A2A2      2000

Is this population in Hardy-Weinberg equilibrium?

Page 28: Population Genetics Lab 2

Allele frequencies: p = (2(5000) + 3000) / (2(10000)) = 0.65 and q = 1 - p = 0.35

Genotype  Expected frequency under HWE  Expected number under HWE
A1A1      p^2 = 0.65^2 = 0.4225         0.4225 × 10000 = 4225
A1A2      2pq = 0.455                   0.455 × 10000 = 4550
A2A2      q^2 = 0.1225                  0.1225 × 10000 = 1225

Genotype  Obs. # (O)  Exp. # (E)  (O-E)  (O-E)^2  (O-E)^2/E
A1A1      5000        4225        775    600625   142.1598
A1A2      3000        4550        -1550  2402500  528.022
A2A2      2000        1225        775    600625   490.3061

χ2 = 1160.488

Page 29: Population Genetics Lab 2

The critical value (Table value) of χ2 at 1 df and at α=0.05 is approx. 3.84.

Conclusion: Because the calculated value of χ2 (1160.49) is greater than the critical value (3.84), we reject the null hypothesis and accept the alternative (Not in HWE).

We estimated 1 parameter (p) from the data (3 genotypes).

We do not count q as an estimated parameter because it is dependent on p (i.e., q = 1 - p).

df = 3 - 1 - 1 = 1
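The whole Mountain Laurel test can be sketched in Python for a locus with two alleles. The function name is an illustrative choice, and the critical value 3.84 (df = 1, α = 0.05) is taken from the text rather than computed, so no statistics library is needed.

```python
def hwe_chi_square(n11, n12, n22):
    """Chi-square statistic for departure from HWE at a two-allele locus.
    Returns the statistic and the expected genotype counts."""
    n = n11 + n12 + n22
    p = (2 * n11 + n12) / (2 * n)   # estimated from the data
    q = 1 - p                       # not an extra estimated parameter
    observed = [n11, n12, n22]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected

# Critical value quoted in the text for df = 3 - 1 - 1 = 1 and alpha = 0.05.
CRITICAL_VALUE = 3.84

chi2, expected = hwe_chi_square(5000, 3000, 2000)
reject_h0 = chi2 > CRITICAL_VALUE
print(f"expected = {[round(e) for e in expected]}")
print(f"chi-square = {chi2:.3f}, reject H0: {reject_h0}")
```

For a locus with more than two alleles, both the expected counts and the degrees of freedom change, so this sketch would need to be extended rather than reused as is.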

Page 30: Population Genetics Lab 2

Problem 5 (Time 10 min) (2 pts)

Based on the observed genotype counts in Problem 3, test whether the population that had been sampled is in HWE. What are some possible explanations for the observed results?

Genotype  Count
A1A1      19
A2A2      17
A3A3      14
A1A2      52
A1A3      57
A2A3      41