Introduction to the Bag of Little Bootstraps
Reading group presentation on the Bag of Little Bootstraps (BLB)

ML-IR Discussion: Bag of Little Bootstraps (BLB)
- Recap: why bootstrap, what is the bootstrap
- Bag of Little Bootstraps (BLB): guarantees, examples
Recap:
Population
Our Sample
Estimate the median!
Asymptotic Approach
Theory has it: for a sample of size n, with population median m and density f,
    sqrt(n) * (m_hat - m)  ->  N(0, 1 / (4 f(m)^2))
95% Confidence Interval:
    m_hat +/- 1.96 / (2 f(m) sqrt(n))
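The asymptotic CI above can be sketched numerically. This is a minimal illustration on a toy standard-normal sample; the Gaussian kernel density estimate and Silverman's bandwidth rule are illustrative choices for estimating f, not something the slides prescribe.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)         # toy sample of size n
n = x.size
m_hat = np.median(x)

# Estimate the density f at the sample median with a Gaussian kernel
h = 1.06 * x.std() * n ** (-1 / 5)  # Silverman's rule-of-thumb bandwidth
f_hat = np.mean(np.exp(-0.5 * ((m_hat - x) / h) ** 2)) / (h * np.sqrt(2 * np.pi))

# 95% CI from sqrt(n) * (m_hat - m) -> N(0, 1 / (4 f(m)^2))
half_width = 1.96 / (2 * f_hat * np.sqrt(n))
ci = (m_hat - half_width, m_hat + half_width)
```

Note that this already shows the practical pain point: the interval depends on a density estimate f_hat, which is exactly what the next slide flags as hard.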
Problems with the asymptotic approach:
- The density f is hard to estimate
- The sample size needed for the Central Limit Theorem to kick in is much larger than for the mean
- The true median is unknown
Solution: when theory is too hard... let's empirically estimate the theoretical truth!
Empirical Approach: Ideal
Population: sample over and over again!
Median Est 1, Median Est 2, ...
Take the middle 95% of the sample medians.
Similar enough? (Population vs. our one actual sample)
Empirical Approach: Bootstrap (Efron & Tibshirani, 1993)
Our Sample: draw n samples with replacement, over and over.
Median Est* 1, Median Est* 2, ...
Take the middle 95% of the bootstrap medians.
Used for:
- Bias estimation
- Variance
- Confidence intervals
Main benefits:
- Automatic
- Flexible
- Fast convergence (Hall, 1992)
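The procedure on the previous slides fits in a few lines. A minimal sketch of the percentile bootstrap CI for the median, on a toy standard-normal sample; B and the seed are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000)          # our sample of size n
B = 2_000                           # number of bootstrap resamples

# Draw n values with replacement, B times; record each resample's median
boot_medians = np.array([
    np.median(rng.choice(x, size=x.size, replace=True)) for _ in range(B)
])

# Percentile 95% confidence interval: the middle 95% of bootstrap medians
lo, hi = np.quantile(boot_medians, [0.025, 0.975])
```

No density estimate, no formula specific to the median: this is the "automatic" and "flexible" part.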
Key: There are 3 distributions
- Population
- Actual Sample: an approximate distribution of the population
- Bootstrap Samples: approximate the approximation (Is there bias? What's the variance? etc.)
No free meals:
- Bootstrapping requires re-sampling the full-size sample B times
- Each resample is size n; naively sampling m < n violates the sample-size properties
- The original sample size cannot be too small ("pre-asymptopia" cases)
Hope:
- A resample contains ~0.632n unique values in expectation
- Sample less: the m-out-of-n bootstrap is possible with analytical adjustments (Bickel et al., 1997)
Intuition: we need fewer than all n values for each bootstrap resample.
Problem:
- The analytical adjustment is not as automatic as desirable
- The m-out-of-n bootstrap is sensitive to the choice of m
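The ~0.632n figure is easy to check empirically (0.632 is approximately 1 - 1/e, the expected fraction of distinct items when drawing n of n with replacement); n and the seed here are arbitrary.

```python
import random

random.seed(0)
n = 100_000
# Resample n items with replacement and count how many are distinct
resample = random.choices(range(n), k=n)
unique_fraction = len(set(resample)) / n   # expect about 0.632
```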
Bag of Little Bootstraps
- Sample the data without replacement s times into subsets of size b
- Resample each subset up to size n, r times
- Compute the median for each resample (Med 1, ..., Med r)
- Compute the confidence interval for each subset
- Take the average of the upper and lower endpoints across subsets to form the final confidence interval
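The steps above can be sketched as a runnable function. The defaults gamma=0.6, s=20, r=100, the seed, and the weighted-median helper are illustrative choices, not values fixed by the slides; resampling to size n is done with the b-dimensional multinomial trick, so only b unique values are ever stored.

```python
import numpy as np

def weighted_median(values, counts):
    """Median of a size-n resample stored as b unique values plus repetition counts."""
    order = np.argsort(values)
    v, cum = values[order], np.cumsum(counts[order])
    return v[np.searchsorted(cum, cum[-1] / 2)]

def blb_median_ci(x, gamma=0.6, s=20, r=100, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n = x.size
    b = int(n ** gamma)                                  # little-bootstrap size
    lowers, uppers = [], []
    for _ in range(s):
        subset = rng.choice(x, size=b, replace=False)    # subsample w/o replacement
        medians = np.empty(r)
        for j in range(r):
            # Resample up to size n via a b-dimensional multinomial with n trials
            counts = rng.multinomial(n, np.full(b, 1.0 / b))
            medians[j] = weighted_median(subset, counts)
        lo, hi = np.quantile(medians, [alpha / 2, 1 - alpha / 2])
        lowers.append(lo)
        uppers.append(hi)
    # Average the per-subset CI endpoints
    return float(np.mean(lowers)), float(np.mean(uppers))
```

Each pass of the outer loop touches only b values, so the s subsets can be processed on separate machines and only the two endpoints need to be shipped back.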
Bag of Little Bootstraps (Kleiner et al., 2012)
Computational gains:
- Each resample only has b unique values!
- Can sample a b-dimensional multinomial with n trials
- Scales in b instead of n
- Easily parallelizable
If b = n^(0.6) and the dataset is 1TB:
- Bootstrap storage demands ~632GB
- BLB storage demands ~4GB
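The storage figures follow from simple arithmetic. Assuming n = 10^6 records totaling 1TB (so ~1MB per record); the record count is an assumption used only to make the numbers concrete.

```python
n = 1_000_000
total_gb = 1_000.0                  # 1TB dataset
record_gb = total_gb / n            # ~1MB per record

bootstrap_gb = 0.632 * total_gb     # ~0.632n unique records per bootstrap resample
b = round(n ** 0.6)                 # BLB subset size, b = n^(0.6)
blb_gb = b * record_gb              # storage per BLB subset
```

This gives roughly 632GB per bootstrap resample versus roughly 4GB per BLB subset.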
Bag of Little Bootstraps
Theoretical guarantees:
- Consistency
- Higher-order correctness
- Fast convergence rate (same as the bootstrap)
Performance
b = n^(gamma), 0.5 <= gamma <= 1. These choices of gamma ensure bootstrap convergence rates.
[Figure: relative error of confidence-interval width for logistic-regression coefficients (Kleiner et al., 2012); panels: Gamma residuals, t-distributed residuals]
Performance vs Time
Selecting Hyperparameters
- b: the number of unique samples in each little bootstrap
- s: the number of size-b subsets sampled without replacement
- r: the number of multinomials to draw
b: the larger the better.
s, r: adaptively increase until convergence is reached (the median estimate stops changing).
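One way to sketch the adaptive rule for r: keep drawing multinomial resamples in batches until the CI endpoints stop moving. The tol, batch, and max_r knobs below are illustrative choices, not values prescribed by the slides.

```python
import numpy as np

def adaptive_r_median_ci(subset, n, tol=0.005, batch=50, max_r=2000, seed=0):
    rng = np.random.default_rng(seed)
    b = subset.size
    order = np.argsort(subset)
    v = subset[order]                       # subset values, sorted once
    medians, prev = [], None
    while len(medians) < max_r:
        for _ in range(batch):
            # Resample up to size n with a b-dimensional multinomial
            counts = rng.multinomial(n, np.full(b, 1.0 / b))
            cum = np.cumsum(counts[order])
            medians.append(v[np.searchsorted(cum, n / 2)])
        cur = np.quantile(medians, [0.025, 0.975])
        if prev is not None and np.max(np.abs(cur - prev)) < tol:
            break                           # CI endpoints have stopped changing
        prev = cur
    return len(medians), (float(cur[0]), float(cur[1]))
```

The same stopping rule can be applied one level up to s, comparing the averaged endpoints after each new subset.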
Bag of Little Bootstraps
Main benefits:
- Computationally friendly
- Maintains most statistical properties of the bootstrap
- Flexibility
- More robust to the choice of b than older methods
References
- Efron, B. and Tibshirani, R. (1993). An Introduction to the Bootstrap.
- Kleiner, A. et al. (2012). A Scalable Bootstrap for Massive Data.
Thanks!