Bootstrap t Test

Upload: pinkgolem

Post on 04-Jun-2018


TRANSCRIPT

  • 8/13/2019 Bootstrap t Test

    1/19

    Bootstrap

    Bootstrapping applied to t-tests

    Problems with t

    Wilcox notes that when we sample from a non-normal population, assuming normality of the sampling distribution may be optimistic without large samples.

    Furthermore, outliers influence both the mean and the standard deviation used to calculate t. The effect is actually larger on the variance, increasing type II error, because the standard error increases more than the mean does.

    This is not to say we throw the t-distribution out the window. If we meet our assumptions and have 'pretty' data, it is appropriate. However, if we cannot meet the normality assumption we may have to try a different approach, e.g. bootstrapping.

    More issues with the t-test

    In the two-sample case we have an additional assumption (along with normality and independent observations): we assume equal variances in the groups.

    Recall our homoscedasticity discussion.

    Often this assumption is untenable, and violating it, like other violations, results in calculated probabilities that are inaccurate.

    We can use a correction, e.g. Welch's t.

    More issues with the t-test

    It is one thing to say that the variances are unequal, but what might that mean?

    Consider a control group and a treatment group, where the treatment group's variance is significantly greater.

    While we can apply a correction, the unequal variances may suggest that those in the treatment group vary widely in how they respond to the treatment.

    Another reason for heterogeneity of variance may be that an unreliable measure is being used.

    No version of the t-test takes either into consideration. Other techniques, assuming enough information has been gathered, may be more appropriate (e.g. hierarchical models), and more reliable measures may be attainable.

    *Note that, if those in the treatment group are truly more variable, a more reliable measure would actually detect this more readily (i.e. more reliability would lead to a less powerful test). We will consider this more later.

    The good and the bad regarding t-tests

    The good:

    If assumptions are met, the t-test is fine.

    When assumptions aren't met, the t-test may still be robust with regard to type I error in some situations.

    With equal n and normal populations, homogeneity-of-variance violations won't increase type I error much.

    With non-normal distributions but equal variances, the type I error rate is also maintained.

    The bad:

    Even small departures from the assumptions cause power to take a noticeable hit (type II error is not maintained).

    The t-statistic and CIs will be biased.

    Bootstrap

    Recall the notion of a sampling distribution. We never have the population available in practice, so we take a sample (one of an infinite number of possible ones). The sampling distribution is a theoretical distribution whose shape we assume.

    Bootstrap

    The basic idea involves sampling with replacement from the sample data (essentially treating it as the population) to produce random samples of size n. We create an empirical sampling distribution.

    Each of these samples provides an estimate of the parameter of interest. Repeating the sampling a large number of times provides information on the variability of the estimator.

    Bootstrap

    Hypothetical situation: if we cannot assume normality, how would we go about getting a confidence interval?

    Wilcox suggests that assuming normality via the central limit theorem doesn't hold for small samples, and could sometimes require a sample as large as 200 to maintain type I error if the population is not normally distributed.

    If we do not maintain type I error, confidence intervals and inferences based on them will be suspect.

    How might you get a confidence interval for something besides a mean?

    Solution: resample (with replacement) from our own data based on its distribution. Treat our sample as a population distribution and take random samples from it.

    The percentile bootstrap

    We will start by considering a mean. We can bootstrap many sample means based on the original data.

    One method would be to simply create this distribution of means and note the percentiles associated with certain values.

    The percentile bootstrap

    Here are some values (from the Wilcox text), mental health ratings of college students:

    2, 4, 6, 6, 7, 11, 13, 13, 14, 15, 19, 23, 24, 27, 28, 28, 28, 30, 31, 43

    Mean = 18.6
    Bootstrap mean (k = 1000) = 18.52
    The bootstrapped 95% CI is (13.85, 23.10)
    Assuming normality: (13.39, 23.81)

    The coverage differs (non-symmetric for the bootstrap), and the classical interval is noticeably wider.
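    A sketch of the percentile bootstrap on these ratings (Python for illustration; the exact endpoints depend on the random seed and will differ slightly from the slide's):

    ```python
    import numpy as np

    # Mental health ratings from the slide (Wilcox text)
    x = np.array([2, 4, 6, 6, 7, 11, 13, 13, 14, 15,
                  19, 23, 24, 27, 28, 28, 28, 30, 31, 43])
    print(x.mean())  # 18.6, matching the slide

    rng = np.random.default_rng(42)
    boot_means = np.array([
        rng.choice(x, size=x.size, replace=True).mean()
        for _ in range(1000)
    ])

    # Percentile bootstrap 95% CI: simply read off the .025 and .975
    # quantiles of the bootstrap distribution of means.
    lo, hi = np.quantile(boot_means, [0.025, 0.975])
    print(lo, hi)
    ```

    The interval need not be symmetric around 18.6, which is the point the slide makes against the classical interval.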

    The percentile t bootstrap

    Another approach would be to create an empirical t distribution.

    Recall the formula for a one-sample t:

    t = (X̄ − μ) / (s/√n)

    For our purposes here, we will calculate a t 1000 times, as follows: with the mean and standard deviation of each of those 1000 bootstrap samples, calculate

    T* = (X̄* − X̄) / (s*/√n)

    The percentile t bootstrap

    This gives us a t distribution with 1000 t scores.

    For a confidence interval, we now find the exact t values corresponding to the appropriate quantiles (e.g. .025 and .975) and use those, with the original sample statistics, to calculate the CI:

    ( X̄ − T*_U · s/√n , X̄ − T*_L · s/√n )

    where T*_L and T*_U are the .025 and .975 quantiles of the bootstrap T* distribution. T*_L is typically negative, so the second endpoint is the upper bound.
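    The percentile-t steps can be sketched as follows (Python for illustration, on the same ratings data from the earlier slide; the seed, and hence the exact endpoints, are arbitrary):

    ```python
    import numpy as np

    x = np.array([2, 4, 6, 6, 7, 11, 13, 13, 14, 15,
                  19, 23, 24, 27, 28, 28, 28, 30, 31, 43])
    n, xbar, s = x.size, x.mean(), x.std(ddof=1)

    rng = np.random.default_rng(42)
    tstars = []
    for _ in range(1000):
        b = rng.choice(x, size=n, replace=True)
        # T* = (bootstrap mean - sample mean) / (bootstrap sd / sqrt(n))
        tstars.append((b.mean() - xbar) / (b.std(ddof=1) / np.sqrt(n)))

    t_lo, t_hi = np.quantile(tstars, [0.025, 0.975])

    # The quantiles swap ends: subtracting the large positive T* quantile
    # gives the lower bound, subtracting the negative one gives the upper.
    ci = (xbar - t_hi * s / np.sqrt(n), xbar - t_lo * s / np.sqrt(n))
    print(ci)
    ```

    The CI uses the original sample's mean and standard deviation; only the critical values come from the bootstrap.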

    Confidence Intervals

    So what we have done is, instead of assuming some sampling distribution of a particular shape and size, we have created it ourselves and derived our interval estimate from it.

    Simulations have shown that this approach is preferable for maintaining type I error with larger samples in which the normality assumption may be untenable.

    Independent Groups

    Comparing independent groups:

    Step 1: compute the bootstrap mean and bootstrap sd as before, but for each group. Each time you do so, calculate T*. This again creates your own t distribution.

    T* = [ (X̄₁* − X̄₂*) − (X̄₁ − X̄₂) ] / √( s₁*²/n₁ + s₂*²/n₂ )

    Hypothesis Testing

    Use the quantile points corresponding to your confidence level when computing your confidence interval on the difference between means, rather than the critical t from the usual distributions.

    Note, however, that your T* will not be the same for the upper and lower bounds, unless your bootstrap distribution is perfectly symmetrical. That is not likely to happen, so each bound has the form (X̄₁ − X̄₂) − T* · s_{X̄₁−X̄₂}, with its own quantile:

    ( (X̄₁ − X̄₂) − T*_U · s_{X̄₁−X̄₂} , (X̄₁ − X̄₂) − T*_L · s_{X̄₁−X̄₂} )

    where s_{X̄₁−X̄₂} = √( s₁²/n₁ + s₂²/n₂ ).
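    The two-sample bootstrap-t interval can be sketched like this (Python for illustration; the example data are the y and z vectors from the R example on the last slide, and the seed is arbitrary):

    ```python
    import numpy as np

    def boot_t_ci_2samp(x, y, nboot=1000, alpha=0.05, seed=1):
        """Bootstrap-t CI for the difference in means (no trimming)."""
        rng = np.random.default_rng(seed)
        nx, ny = x.size, y.size
        diff = x.mean() - y.mean()
        se = np.sqrt(x.var(ddof=1) / nx + y.var(ddof=1) / ny)
        tstar = np.empty(nboot)
        for i in range(nboot):
            bx = rng.choice(x, nx, replace=True)  # resample each group separately
            by = rng.choice(y, ny, replace=True)
            se_b = np.sqrt(bx.var(ddof=1) / nx + by.var(ddof=1) / ny)
            tstar[i] = ((bx.mean() - by.mean()) - diff) / se_b
        t_lo, t_hi = np.quantile(tstar, [alpha / 2, 1 - alpha / 2])
        # As in the one-sample case, the quantiles swap ends.
        return diff - t_hi * se, diff - t_lo * se

    x = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 7, 9])
    y = np.array([1, 3, 2, 3, 4, 4, 5, 5, 7, 10, 22])
    lo, hi = boot_t_ci_2samp(x, y)
    print(lo, hi)
    ```

    Because T*_L and T*_U come from an asymmetric empirical distribution, the interval is generally not symmetric around the observed difference.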

    Hypothesis Testing

    One can obtain 'symmetric' intervals. Instead of using the value obtained in the numerator (mean − μ, or difference between means − (μ₁ − μ₂)), use its absolute value:

    T* = |X̄* − X̄| / se*

    Then apply the standard ± formula, with T* now the 1 − α quantile of these absolute values:

    X̄ ± T* · s/√n

    This may in fact be the best approach for most situations.
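    A sketch of the symmetric version, again on the ratings data (Python for illustration, arbitrary seed):

    ```python
    import numpy as np

    x = np.array([2, 4, 6, 6, 7, 11, 13, 13, 14, 15,
                  19, 23, 24, 27, 28, 28, 28, 30, 31, 43])
    n, xbar, s = x.size, x.mean(), x.std(ddof=1)

    rng = np.random.default_rng(7)
    abs_t = np.array([
        abs(b.mean() - xbar) / (b.std(ddof=1) / np.sqrt(n))
        for b in (rng.choice(x, n, replace=True) for _ in range(1000))
    ])

    # Take the single 1 - alpha quantile of |T*| and use it in the usual
    # +/- formula, which forces the interval to be symmetric about X-bar.
    t_c = np.quantile(abs_t, 0.95)
    lo, hi = xbar - t_c * s / np.sqrt(n), xbar + t_c * s / np.sqrt(n)
    print(lo, hi)
    ```

    Only one bootstrap critical value is needed here, rather than separate upper and lower quantiles.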

    Extension

    We can incorporate robust measures of location rather than means, e.g. trimmed means.

    With a program like R it is very easy to do both bootstrapping and robust measures using Wilcox's libraries: http://psychology.usc.edu/faculty_homepage.php?id=43

    Put the Rallfun files (most recent) in your version 2.x main folder and 'source' them; then you're ready to start using such functionality, e.g. source("Rallfunv1.v5").

    Example code is on the last slide.

    The general approach can also be extended to more than two groups, correlation, and regression.

    So why use it?

    Accuracy and control of the type I error rate, as opposed to just assuming it will be OK.

    Most of the problems associated with both accuracy and maintenance of the type I error rate are reduced using bootstrap methods compared to Student's t.

    Wilcox goes further, suggesting that there may in fact be very few situations, if any, in which the traditional approach offers any advantage over the bootstrap approach.

    The problem of outliers and the basic statistical properties of means and variances remain, however.

    Example: independent samples t-test in R

    source("Rallfunv1.v5")
    source("Rallfunv2.v5")

    y <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 7, 9)
    z <- c(1, 3, 2, 3, 4, 4, 5, 5, 7, 10, 22)

    # Standard t-test (note t.test takes conf.level, not alpha)
    t.test(y, z, conf.level = 0.95)

    # Wilcox's bootstrap-t test: tr = 0 means no trimming,
    # side = TRUE requests a symmetric two-sided CI
    yuenbt(y, z, tr = 0, alpha = .05, nboot = 600, side = TRUE)