25 fin

Post on 13-May-2015

Hadley Wickham

Stat310 Fin

Saturday, 24 April 2010

To those of you who bought your textbooks from my Amazon link.

To the textbook publishers who generously sent me free copies of books.

To Kensey for suggesting Chick-fil-A.

Thank you!

1. Eat!

2. Final & help sessions

3. Finish off hypothesis testing

4. Other statistics opportunities

5. Feedback (TA & me)

Final

Take home. Two hours long. Three (double-sided) pages of notes.

Available Wednesday April 28, 9am. Due Wednesday May 5, 5pm, under my door.

Ten small questions of approximately equal weight. Similar to questions from the homework/book.

Common themes

Probability of an event.

Independence & conditioning.

Distributions: pdf/pmf, cdf, mgf, named.

Transformations.

Sampling distribution of mean and variance.

Estimation and testing.

Philosophy of grading.

Help sessions

Mon, Tue, Wed, Thurs, Fri, Sat, Sun?

Morning or afternoon?

One-on-one help, plus brief revision of topics of particular interest. Suggest and vote at http://goo.gl/mod/joIx

Honour code

Remember to pledge your exam, and note the time at which you started and ended.

You may refer only to your note sheets, not to the textbook, old homework, etc.

Hypothesis testing

Course grades

Assume I took a random sample of 20 students from each year, and that course grades are normally distributed with variance 80.

What is the distribution of difference of the two group means?
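One way to work this out, sketched in Python under the slide's assumptions (20 students per year, variance 80) plus the assumption that the two samples are independent: each sample mean has variance 80/20 = 4, so the difference of the two means is normal with variance 4 + 4 = 8, centred on the difference of the population means. The variable names, the seed, and the common mean of 80 used in the simulation check are illustrative choices, not from the slides.

```python
import random
import statistics

n = 20          # students sampled per year (from the slide)
variance = 80   # variance of individual grades (from the slide)

# The variance of one sample mean is sigma^2 / n.
var_mean = variance / n
# Variances of independent sample means add when the means are subtracted.
var_diff = var_mean + var_mean
sd_diff = var_diff ** 0.5

# Simulation check, assuming for illustration that both years share mean 80.
rng = random.Random(1)
sd = variance ** 0.5

def sample_mean(mu):
    """Mean of n simulated grades drawn from Normal(mu, variance)."""
    return statistics.fmean(rng.gauss(mu, sd) for _ in range(n))

diffs = [sample_mean(80) - sample_mean(80) for _ in range(20000)]
sim_var = statistics.pvariance(diffs)

print("theoretical variance of the difference:", var_diff)
print("simulated variance of the difference:", round(sim_var, 2))
```

The simulated variance should land close to 8, so under the null hypothesis of equal means the difference is approximately Normal(0, 8).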

Your turn

The average grade from 2009 was 85 and the average grade from 2010 was 90.

What is the p-value? (The probability that you’d see a difference this large or larger if there really was no difference in the population means.)

1. Write down H0 and Ha (positions of defence and prosecution)

2. Figure out good test statistic (what numeric summary?)

3. Work out null distribution (distribution of innocents)

4. Calculate p-value by comparing actual value to null distribution (what proportion of true innocents look more guilty than the suspect)

5. Reject H0 if the p-value is smaller than the cutoff
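The five steps can be sketched in Python for the course-grades example (means of 85 and 90, 20 students per year, variance 80). This is a minimal illustration, not a prescribed method: it assumes independent samples, a two-sided alternative, and a 0.05 cutoff, and the variable names are made up for the example.

```python
from statistics import NormalDist

# Step 1: H0 says the population means are equal; Ha says they differ.
# Step 2: the test statistic is the difference in sample means.
mean_2009, mean_2010 = 85, 90
diff = mean_2010 - mean_2009

# Step 3: under H0 the difference is Normal(0, 80/20 + 80/20).
n, variance = 20, 80
null_dist = NormalDist(mu=0, sigma=(2 * variance / n) ** 0.5)

# Step 4: two-sided p-value, P(|difference| >= observed) under H0.
p_value = 2 * (1 - null_dist.cdf(abs(diff)))

# Step 5: reject H0 if the p-value is below the cutoff (0.05 assumed here).
alpha = 0.05
reject = p_value < alpha

print("p-value:", round(p_value, 3))
print("reject H0:", reject)
```

Under these assumptions the p-value comes out at roughly 0.077, so the null hypothesis is not rejected at the 0.05 cutoff but would be at 0.10.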

              Say is guilty      Say is innocent
Is guilty     Correct            False acquittal
Is innocent   False conviction   Correct

Your turn

Which type of error is more expensive/more costly/worse in the criminal justice system?

            Reject H0       Accept H0
H0 false    Correct         Type II error
H0 true     Type I error    Correct

Rates

For a given test,

P(false conviction) = α = significance level

P(false acquittal) = β; power = 1 - β

What do you think happens to β if you try to make α smaller?

α↑ β↓

α↓ β↑

Cut off

Choose cut-off based on rate of false convictions.

If you want a 5% rate of false convictions, reject H0 if the p-value is less than 0.05. (This is the industry standard rate.)

Can work out power.
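As a sketch of how the power could be worked out for the course-grades setup (20 students per year, variance 80), the Python below computes the chance that a two-sided z-test rejects the null when the true difference in means is 5. The function name `power`, its parameters, and the choice of a true difference of 5 are assumptions made for illustration. It also shows the trade-off between the two error rates: a smaller false conviction rate gives lower power, that is, more false acquittals.

```python
from statistics import NormalDist

std_normal = NormalDist()

def power(alpha, true_diff=5.0, n=20, variance=80.0):
    """Chance that a two-sided z-test rejects H0 when the true difference is true_diff."""
    se = (2 * variance / n) ** 0.5              # sd of the difference in means
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # cutoff on the z-score scale
    shift = true_diff / se                      # mean of the z-score when H0 is false
    # Reject when |Z| > z_crit, where Z ~ Normal(shift, 1).
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower

for alpha in (0.10, 0.05, 0.01):
    print("alpha =", alpha, "power =", round(power(alpha), 3))
```

With a 0.05 cutoff the power is a little over 0.4 under these assumptions. Setting `true_diff=0` returns the cutoff itself, since rejecting a true null is a false conviction.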

[Figure: points marked x and y; vertical axis 76 to 90, horizontal axis 20 to 100; μx=80, μy=85]

[Figure: Difference; vertical axis −2 to 10, horizontal axis 20 to 100; μx=80, μy=85]

[Figure: |Difference|; vertical axis 0 to 10, horizontal axis 20 to 100; μx=80, μy=85]

[Figure: z-score; vertical axis 0.0 to 3.5, horizontal axis 20 to 100; μx=80, μy=85]

[Figure: p-value; vertical axis 0.0 to 0.8, horizontal axis 20 to 100; μx=80, μy=85]

Correctly reject null 39% of the time

[Figure: points marked x and y; vertical axis 76 to 84, horizontal axis 20 to 100; μx=μy=80]

[Figure: difference; vertical axis −5 to 5, horizontal axis 20 to 100; μx=μy=80]

[Figure: z-score; vertical axis 0.0 to 3.0, horizontal axis 20 to 100; μx=μy=80]

[Figure: |difference|; vertical axis 0 to 8, horizontal axis 20 to 100; μx=μy=80]

[Figure: p-value; vertical axis 0.0 to 0.8, horizontal axis 20 to 100; μx=μy=80]

Incorrectly reject null 6% of the time
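The rejection rates quoted on these slides can be approximated by simulation. The Python below is a hedged reconstruction, not the code behind the original plots: it assumes 20 students per year, variance 80, a two-sided z-test with a 0.05 cutoff, and it reads the horizontal axes (running to 100) as 100 repeated experiments. The function name `rejection_rate`, its parameters, and the seeds are illustrative.

```python
import random
from statistics import NormalDist, fmean

def rejection_rate(mu_x, mu_y, reps=100, n=20, variance=80.0, alpha=0.05, seed=0):
    """Fraction of simulated experiments in which H0 (equal means) is rejected."""
    rng = random.Random(seed)
    sd = variance ** 0.5
    se = (2 * variance / n) ** 0.5   # sd of the difference in sample means
    std_normal = NormalDist()
    rejections = 0
    for _ in range(reps):
        x_bar = fmean(rng.gauss(mu_x, sd) for _ in range(n))
        y_bar = fmean(rng.gauss(mu_y, sd) for _ in range(n))
        z = abs(y_bar - x_bar) / se
        p_value = 2 * (1 - std_normal.cdf(z))
        rejections += p_value < alpha
    return rejections / reps

# Many repetitions give stable estimates of the two rates.
power_rate = rejection_rate(80, 85, reps=10000)   # H0 false: correct rejections
false_rate = rejection_rate(80, 80, reps=10000)   # H0 true: false convictions
print("reject when means differ by 5:", power_rate)
print("reject when means are equal:", false_rate)

# With only 100 repetitions the estimates bounce around from run to run.
print("100 repetitions:", rejection_rate(80, 85), rejection_rate(80, 80))
```

With only 100 repetitions, figures such as 39% and 6% are plausible draws around the long-run rates, which sit near 42% and 5% under these assumptions.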

Your turn

The average grade from 2009 was 85 and the average grade from 2010 was 90. Would you reject the null hypothesis that the average grade was the same?

Connection to confidence intervals

If you construct a 90% confidence interval and it doesn’t include the parameter value under the null, then the p-value must be < 1 - 0.9 = 0.1.

If the p-value is 0.08, then a 92% or greater confidence interval would include the null parameter, and an interval with a lower confidence level would not.
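The link between the two can be checked numerically for the course-grades example. This Python sketch assumes the same setup as before (observed difference 90 - 85 = 5, 20 students per year, variance 80, two-sided z-based intervals); the helper `conf_int` is a hypothetical name introduced for the illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()
diff = 90 - 85                 # observed difference in sample means
se = (2 * 80 / 20) ** 0.5      # sd of the difference under the stated assumptions

def conf_int(level):
    """Two-sided z-based confidence interval for the difference in means."""
    z = std_normal.inv_cdf(1 - (1 - level) / 2)
    return diff - z * se, diff + z * se

p_value = 2 * (1 - std_normal.cdf(diff / se))

low_90, high_90 = conf_int(0.90)
low_95, high_95 = conf_int(0.95)
print("90% interval excludes 0:", low_90 > 0)
print("95% interval includes 0:", low_95 <= 0 <= high_95)

# At confidence level 1 - p-value, the interval just touches the null value 0.
low_edge, _ = conf_int(1 - p_value)
print("lower limit at level 1 - p:", round(low_edge, 6))
```

The interval at level 1 minus the p-value has its lower limit at zero (up to rounding), which is the boundary case described above.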

Statistics

Majoring

3 required stat classes (Stat310, Stat405, Stat410) + 6 stat electives + calc, linear algebra, computing + design project

Makes for a great double major. Particularly useful if you’re thinking about grad school. (Appealing to employers too)

http://statistics.rice.edu/ShowInterior.aspx?id=58

Minoring

From next year

Three required:

Track A: stat310, stat405, stat400/410

Track B: stat100, stat280, stat385

Three elective: 300 level+, one outside stat if it has a strong statistical component

Stat410

Introduction to linear models

Powerful and general statistical tool.

Theory and data.

Offered in Fall.

Stat405

Project based introduction to data analysis. Lots of computing and hardly any maths.

http://had.co.nz/stat405

Offered in Fall, and next year in Spring.

Electives

SOCI 436 (Houston area survey), 313 (demography)

ECON 340/440 (game theory), 400 (econometrics), 475 (optimisation), 477 (math of economics), 479 (modelling)

STAT 385, 431 (more theory), 420 (process control), 421 (time series), 422 (Bayesian data analysis), 423 (bioinformatics), 453 (biostatistics), 485 (environmental)

Feedback

One form for me.

One form for Xin Zhao, whom most of you never met but who was the TA in charge of your grading.

No form for Garrett.
