statistical comparison of two or more systems


Upload: johnna

Post on 06-Jan-2016


Page 1: Statistical Comparison of Two or More Systems

Statistical Comparison of Two or More Systems

The most relevant of all the Basic Theory Lectures.

No Holidays.

Page 2: Statistical Comparison of Two or More Systems

THE MISSION

Your analysis task involves manipulating conditions of the system of interest from a prescribed set of options.

Design of Experiments: Determine if the different options are really different. Is the best one really statistically better?

Ranking and Selection: What’s the probability that the best sample indicates the best system setting?

Page 3: Statistical Comparison of Two or More Systems

VOCABULARY

Factor: An element of the system that will be manipulated

Setting or Level: A value that a Factor may assume

Page 4: Statistical Comparison of Two or More Systems

EXAMPLE : Simulation model of Football (EA Sports)

Factors: Quarterback, Running Back, Strong Safety

Settings or Levels for Quarterback: Dante’, Bret, Johnny U.

Page 5: Statistical Comparison of Two or More Systems

TYPES OF DESIGNS

One Factor, Two Settings: Paired samples; Behrens-Fisher. Question: Which is Best?

More than one Factor: Factorial Designs; Partially Exhaustive Designs. Question: Are the settings significant difference-makers?

Page 6: Statistical Comparison of Two or More Systems

PAIRED SAMPLES

Example: Quarterback Controversy! Simulate St. Louis Rams vs. Tampa Bay Bucs, recording the Quarterback Rating.

Level 1: Kurt Warner

Level 2: Marc Bulger

Run the simulation 28 times for each player, resulting in the data sets W1, W2, ..., W28 and B1, B2, ..., B28.

Is E[B] > E[W]?

Page 7: Statistical Comparison of Two or More Systems

BRUTE FORCE

Confidence interval on the quantity E[W]-E[B]

If it doesn’t include 0.0, we have statistically significant evidence (at the interval's confidence level) that there is a difference

Equivalent to the Hypothesis Test H0: E[B]=E[W]

Page 8: Statistical Comparison of Two or More Systems

CALCULATIONS ON VARIANCES: SOME BASICS Let X and Y be random variables

$$\mathrm{VAR}[X] = E[(X - E[X])^2] = \int (x - E[X])^2 \, dF_X(x)$$

Page 9: Statistical Comparison of Two or More Systems

CALCULATIONS ON VARIANCES: SOME BASICS Let X and Y be random variables

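1) $$\mathrm{VAR}[X] = E[X^2] - (E[X])^2$$

2) $$\mathrm{VAR}[X + Y] = \mathrm{VAR}[X] + \mathrm{VAR}[Y] + 2\,\mathrm{COV}[X, Y]$$

3) $$\mathrm{COV}[X, Y] = E[XY] - E[X]\,E[Y]$$

4) $$\mathrm{VAR}[cX] = c^2\,\mathrm{VAR}[X]$$

5) $$\mathrm{VAR}[X - Y] = \mathrm{VAR}[X] + \mathrm{VAR}[Y] - 2\,\mathrm{COV}[X, Y]$$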

COV=0 if X and Y are independent.
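
The variance rules for sums, differences and scaled variables can be checked numerically. The sketch below is only an illustration: the two correlated samples are made up, and the moments are computed over the sample itself (dividing by n), so the identities hold up to floating-point rounding.

```python
import random

def mean(values):
    return sum(values) / len(values)

def var(values):
    # E[(X - E[X])^2], taken over the sample itself (divide by n).
    m = mean(values)
    return mean([(v - m) ** 2 for v in values])

def cov(xs, ys):
    # COV[X, Y] = E[XY] - E[X]E[Y]
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

# Hypothetical data: Y is built from X so that the two are correlated.
rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
ys = [0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

var_sum = var([x + y for x, y in zip(xs, ys)])
var_diff = var([x - y for x, y in zip(xs, ys)])
var_scaled = var([3.0 * x for x in xs])

print(var_sum, var(xs) + var(ys) + 2 * cov(xs, ys))
print(var_diff, var(xs) + var(ys) - 2 * cov(xs, ys))
print(var_scaled, 9.0 * var(xs))
```

Each printed pair should agree to many decimal places. Note that the covariance term enters the difference with a minus sign, which is what makes induced positive correlation useful later in the lecture.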

Page 10: Statistical Comparison of Two or More Systems

SAMPLE MEAN

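For independent, identically distributed X_i:

$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$$

$$\mathrm{VAR}[\bar{X}_n] = \mathrm{VAR}\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{VAR}[X_i] = \frac{n\,\mathrm{VAR}[X]}{n^2} = \frac{\mathrm{VAR}[X]}{n}$$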
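
The sample-mean result, that the variance of X-bar is VAR[X]/n for independent samples, can be checked by simulation. This is a rough sketch under assumed settings: uniform(0,1) draws (variance 1/12), n = 28 to match the example, and an arbitrary seed and replication count.

```python
import random
import statistics

rng = random.Random(2016)      # arbitrary seed, chosen only for repeatability
n = 28                         # sample size per replication
replications = 4000            # number of independent sample means

sample_means = []
for _ in range(replications):
    sample = [rng.random() for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

observed = statistics.pvariance(sample_means)  # spread of the X-bar values
predicted = (1.0 / 12.0) / n                   # VAR[X] / n for uniform(0,1)

print(observed, predicted)
```

The two numbers should be close but not identical, since the observed value is itself an estimate.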

Page 11: Statistical Comparison of Two or More Systems

CONFIDENCE INTERVAL

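α/2 probability of Type I error on each end of the confidence interval

basic interval for X-bar is

$$\bar{X} \pm Z_{\alpha/2}\sqrt{\mathrm{VAR}[\bar{X}]} = \bar{X} \pm Z_{\alpha/2}\sqrt{\sigma^2/n} = \bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$$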

Page 12: Statistical Comparison of Two or More Systems

BASIC CONFIDENCE INTERVAL

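$$(\bar{W} - \bar{B}) \pm Z_{\alpha/2}\sqrt{\mathrm{VAR}[\bar{W} - \bar{B}]}$$

$$\mathrm{VAR}[\bar{W} - \bar{B}] = \mathrm{VAR}[\bar{W}] + \mathrm{VAR}[\bar{B}] - 2\,\mathrm{COV}[\bar{W}, \bar{B}] = \frac{\mathrm{VAR}[W] + \mathrm{VAR}[B]}{28} - 0$$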
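
A minimal sketch of this interval for two independent samples of equal size follows. The function name `difference_interval` and the tiny data sets are hypothetical, and the sample variances stand in for the unknown VAR[W] and VAR[B], which is an assumption the slide does not spell out.

```python
import math
import statistics

def difference_interval(w, b, alpha=0.05):
    """Normal-theory interval for E[W] - E[B], independent samples, equal n."""
    n = len(w)
    if len(b) != n:
        raise ValueError("this sketch assumes equal sample sizes")
    # VAR[W-bar - B-bar] = (VAR[W] + VAR[B]) / n when the covariance is 0.
    var_diff = (statistics.variance(w) + statistics.variance(b)) / n
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
    center = statistics.mean(w) - statistics.mean(b)
    half_width = z * math.sqrt(var_diff)
    return center - half_width, center + half_width

# Hypothetical ratings, far fewer than the 28 runs used in the lecture.
w = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]
low, high = difference_interval(w, b)
print(low, high)
```

If the printed interval contains 0.0, the data do not separate the two means at the chosen level. With so few observations a t quantile would normally replace the normal one; the normal quantile is kept here only to mirror the slide.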

Page 13: Statistical Comparison of Two or More Systems

SPREADSHEET HIGHLIGHTS 1

(U-0.5)*SQRT(12): zero mean, unit stddev

μ + (U-0.5)*SQRT(12)*σ: mean μ, stddev σ, uniform over an interval centered at μ that extends σ*SQRT(12)/2 on each side (σ*SQRT(12) wide in total)
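
The spreadsheet formula translates directly into code. In this sketch `rng.random()` plays the role of RAND(), and the mean of 90 and standard deviation of 10 are made-up values for illustration.

```python
import math
import random
import statistics

def uniform_with_moments(mu, sigma, rng):
    """Uniform draw with mean mu and standard deviation sigma."""
    u = rng.random()                       # U, uniform on [0, 1)
    return mu + (u - 0.5) * math.sqrt(12) * sigma

rng = random.Random(42)                    # arbitrary seed
mu, sigma = 90.0, 10.0                     # hypothetical settings
draws = [uniform_with_moments(mu, sigma, rng) for _ in range(20000)]

half_width = sigma * math.sqrt(12) / 2     # distance from mu to either end
print(statistics.fmean(draws), statistics.pstdev(draws))
print(min(draws) >= mu - half_width, max(draws) <= mu + half_width)
```

The factor SQRT(12) appears because a uniform(0,1) variable has variance 1/12, so multiplying by SQRT(12) rescales it to unit standard deviation.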

Page 14: Statistical Comparison of Two or More Systems

COMMON RANDOM NUMBERS

Correlation is not always BAD!

Suppose we could INDUCE CORRELATION between the W’s and the B’s without adding any bias?

Positive correlation reduces the theoretical variance of W-bar – B-bar, because the covariance term is subtracted

FREE POWER (the probability of correctly rejecting H0: equal means)
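
The effect of common random numbers can be seen in a small experiment. This is a sketch under assumed settings: the ratings are uniform with hypothetical means (85, 88) and standard deviations (10, 12), and the variance of W-bar – B-bar is estimated from the paired differences so that any induced covariance is included.

```python
import math
import random
import statistics

def rating(u, mu, sigma):
    # Uniform rating with mean mu and stddev sigma, driven by the uniform u.
    return mu + (u - 0.5) * math.sqrt(12) * sigma

def variance_of_mean_difference(w, b):
    # Sample variance of D_i = W_i - B_i, divided by n, estimates
    # VAR[W-bar - B-bar] with the covariance term accounted for.
    d = [wi - bi for wi, bi in zip(w, b)]
    return statistics.variance(d) / len(d)

rng = random.Random(7)                     # arbitrary seed
n = 28
u_first = [rng.random() for _ in range(n)]
u_second = [rng.random() for _ in range(n)]

# Independent sampling: each player gets separate uniforms.
w_ind = [rating(u, 85.0, 10.0) for u in u_first]
b_ind = [rating(u, 88.0, 12.0) for u in u_second]

# Common random numbers: both players are driven by the same uniforms.
w_crn = [rating(u, 85.0, 10.0) for u in u_first]
b_crn = [rating(u, 88.0, 12.0) for u in u_first]

var_ind = variance_of_mean_difference(w_ind, b_ind)
var_crn = variance_of_mean_difference(w_crn, b_crn)
print(var_ind, var_crn)
```

The second number should be much smaller, which translates into a narrower confidence interval for the same 28 runs. If both standard deviations are set equal, every paired difference is the same and the estimated variance drops to essentially zero.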

Page 15: Statistical Comparison of Two or More Systems

STREAMING

Segregate the random number generation task into streams connected to phenomena

seed1: Inter-arrival times

seed2: Service times

$$Z_i = a\,Z_{i-1} \bmod m$$

1. Change features of the service.

2. Use exact same arrival stream for comparing each service setting.
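
One way to sketch streaming is to give each phenomenon its own generator object. Everything specific below is an assumption made for illustration: the multiplier a = 16807 and modulus m = 2^31 - 1 (a commonly used Lehmer generator), the seeds, the exponential times, and the single-server queue itself.

```python
import math

class LehmerStream:
    """One stream: Z_i = a * Z_{i-1} mod m, returned as Z_i / m in (0, 1)."""

    def __init__(self, seed, a=16807, m=2**31 - 1):
        self.z, self.a, self.m = seed, a, m

    def uniform(self):
        self.z = (self.a * self.z) % self.m
        return self.z / self.m

    def exponential(self, mean):
        # Inverse-transform draw from an exponential distribution.
        return -mean * math.log(self.uniform())

def simulate_mean_wait(mean_service, n_customers=200, seed1=12345, seed2=67890):
    arrivals = LehmerStream(seed1)   # stream for inter-arrival times
    services = LehmerStream(seed2)   # stream for service times
    clock = 0.0
    server_free_at = 0.0
    total_wait = 0.0
    arrival_times = []
    for _ in range(n_customers):
        clock += arrivals.exponential(1.0)
        arrival_times.append(clock)
        start = max(clock, server_free_at)
        total_wait += start - clock
        server_free_at = start + services.exponential(mean_service)
    return total_wait / n_customers, arrival_times

# Two service settings, compared against the exact same arrival stream.
wait_fast, arrivals_fast = simulate_mean_wait(mean_service=0.5)
wait_slow, arrivals_slow = simulate_mean_wait(mean_service=0.8)
print(arrivals_fast == arrivals_slow, wait_fast, wait_slow)
```

Because the arrival stream has its own seed, changing the service setting cannot disturb the arrival times, so any change in waiting time comes from the service change alone.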

Page 16: Statistical Comparison of Two or More Systems

SPREADSHEET HIGHLIGHTS 2

Use same results of RAND() for building Bulger samples and Warner samples

Note CI shrinkage

Try with identical sigma

Discuss “Estimation”

Page 17: Statistical Comparison of Two or More Systems

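Behrens-Fisher Problem

Comparison of Means

No pairs, no equal sample sizes, and no equal variances are assumed

Remember that we are after the variance of W-bar – B-bar

Common use: New samples vs. History

$$\mathrm{VAR}[\bar{W} - \bar{B}] = \mathrm{VAR}[\bar{W}] + \mathrm{VAR}[\bar{B}] - 2\,\mathrm{COV}[\bar{W}, \bar{B}] = \frac{\mathrm{VAR}[W]}{n_W} + \frac{\mathrm{VAR}[B]}{n_B} - 0$$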
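
For unequal sample sizes and variances, the same interval can be sketched with each variance divided by its own sample size. The function name and data are hypothetical. The Welch-Satterthwaite degrees of freedom returned at the end are not on the slide; they are included as a commonly used refinement for choosing a t quantile in place of the normal one.

```python
import math
import statistics

def unequal_variance_interval(w, b, alpha=0.05):
    """Interval for E[W] - E[B] without pairing or equal variances."""
    n_w, n_b = len(w), len(b)
    v_w = statistics.variance(w) / n_w     # estimate of VAR[W-bar]
    v_b = statistics.variance(b) / n_b     # estimate of VAR[B-bar]
    center = statistics.mean(w) - statistics.mean(b)
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(v_w + v_b)
    # Welch-Satterthwaite approximate degrees of freedom (an addition,
    # useful when a t table is consulted instead of the normal quantile).
    dof = (v_w + v_b) ** 2 / (v_w ** 2 / (n_w - 1) + v_b ** 2 / (n_b - 1))
    return center - half_width, center + half_width, dof

# Hypothetical "new samples" versus a longer "history".
new_samples = [1.0, 2.0, 3.0, 4.0]
history = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
low, high, dof = unequal_variance_interval(new_samples, history)
print(low, high, dof)
```

The approximate degrees of freedom always fall between the smaller of n_W - 1 and n_B - 1 and the pooled value n_W + n_B - 2.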

Page 18: Statistical Comparison of Two or More Systems

SPREADSHEET HIGHLIGHTS

Page 19: Statistical Comparison of Two or More Systems

MULTI-SETTING CASE

Can involve many Factors or just one

Treatment i has mean μ_i

Analysis of Variance (ANOVA)

Data from treatment 1, 2, ..., n

H0: μ_1 = ... = μ_{n-1} = μ_n

Are the treatments distinguishable?

Page 20: Statistical Comparison of Two or More Systems

DESIGN OF EXPERIMENT

Determine Factors and Settings

Collect Data According to Design

Design = Which Factors, Which Settings for each Treatment

Perform ANOVA

State Conclusion

Page 21: Statistical Comparison of Two or More Systems

FULL FACTORIAL

Build sample of All Combinations

Factors: Quarterback (2), Running Back (3), Strong Safety (3)

2x3x3=18 Treatments
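
Enumerating a full factorial design is a Cartesian product of the setting lists. In this sketch the setting labels (QB1, RB1, and so on) are placeholders, since the lecture names only the number of settings per factor.

```python
from itertools import product

# Placeholder setting labels; only the counts (2, 3, 3) come from the slide.
factors = {
    "Quarterback": ["QB1", "QB2"],
    "Running Back": ["RB1", "RB2", "RB3"],
    "Strong Safety": ["SS1", "SS2", "SS3"],
}

# One treatment per combination of settings, one setting per factor.
treatments = [
    dict(zip(factors, combination))
    for combination in product(*factors.values())
]

print(len(treatments))   # 18
print(treatments[0])
```

Each treatment would then be simulated repeatedly, so the total run count grows multiplicatively with every added factor.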

Page 22: Statistical Comparison of Two or More Systems

HOW ANOVA WORKS

Xi,j is the jth sample from the ith treatment point

Assumed iid Normal (never!)

Decomposition of variability: Observation (Obs), Treatment vs. Grand Mean (Tr), Within Treatment (Res)

$$X_{i,j} = \mu_i + e_{i,j}$$

Page 23: Statistical Comparison of Two or More Systems

HYPOTHESIS H0

The treatment variability is random variability

The size of the treatment variability is in-scale with the residual variability

ANOVA uses sums of squares

g treatments

n_t samples from treatment t

Page 24: Statistical Comparison of Two or More Systems

ANOVA TABLE

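With x_{j,t} the jth sample from treatment t, x̄_t the mean of treatment t, and x̄ the grand mean:

$$SS_{Tr} = \sum_{t=1}^{g} n_t\,(\bar{x}_t - \bar{x})^2, \qquad \text{degrees of freedom: } g - 1$$

$$SS_{Res} = \sum_{t=1}^{g}\sum_{j=1}^{n_t} (x_{j,t} - \bar{x}_t)^2, \qquad \text{degrees of freedom: } \sum_{t=1}^{g} n_t - g$$

$$SS_{Obs} = \sum_{t=1}^{g}\sum_{j=1}^{n_t} (x_{j,t} - \bar{x})^2, \qquad \text{degrees of freedom: } \sum_{t=1}^{g} n_t - 1$$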
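
The three sums of squares in the ANOVA table can be computed in a few lines. This is a sketch with a hypothetical function name and made-up data for three treatments; it also shows that the treatment and residual pieces add up to the observation total.

```python
def anova_sums_of_squares(groups):
    """Return {source: (sum of squares, degrees of freedom)}.

    groups is a list of samples, one list per treatment.
    """
    g = len(groups)
    n_total = sum(len(sample) for sample in groups)
    grand_mean = sum(sum(sample) for sample in groups) / n_total
    means = [sum(sample) / len(sample) for sample in groups]

    # Treatment vs. grand mean, weighted by each treatment's sample size.
    ss_tr = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(groups, means))
    # Within treatment: each observation vs. its own treatment mean.
    ss_res = sum((x - m) ** 2 for s, m in zip(groups, means) for x in s)
    # Every observation vs. the grand mean.
    ss_obs = sum((x - grand_mean) ** 2 for s in groups for x in s)

    return {
        "Tr": (ss_tr, g - 1),
        "Res": (ss_res, n_total - g),
        "Obs": (ss_obs, n_total - 1),
    }

# Made-up data: three treatments, three samples each.
table = anova_sums_of_squares([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
for source, (ss, df) in table.items():
    print(source, ss, df)
```

For this data the treatment and residual sums of squares are 26 and 6, which together give the observation total of 32; the degrees of freedom add in the same way (2 + 6 = 8).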

Page 25: Statistical Comparison of Two or More Systems

REMEMBER chi-SQUARED? From our Goodness-of-Fit Test

X~N(0,1)

for n independent X’s, the sum of the n values of X^2 is chi-SQUARED with n degrees of freedom

if estimates (X-bar, sigma) were used to make the X’s N(0,1), lose one d.f. per estimate

Page 26: Statistical Comparison of Two or More Systems

F-distribution

X is chi-sq with n d.f.

Y is chi-sq with m d.f., independent of X

(X/n)/(Y/m) has F distribution with (n, m) d.f.

Page 27: Statistical Comparison of Two or More Systems

ANOVA HYPOTHESIS TEST

$$\frac{SS_{Tr}/\mathrm{d.f.}_{Tr}}{SS_{Res}/\mathrm{d.f.}_{Res}} \sim F$$

The normalizing cancels!

Page 28: Statistical Comparison of Two or More Systems

ANOVA HYPOTHESIS TEST

Compare the test statistic to a table

Reject if it's big and conclude that ... the Treatments are Different!
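
The test itself reduces to one ratio and one table lookup. In this sketch the function name and data are hypothetical, and the critical value 5.14 is the upper 5% point of the F distribution with (2, 6) degrees of freedom as read from a standard table; a different design or significance level needs a different table entry.

```python
def f_statistic(groups):
    """(SS_Tr / d.f._Tr) / (SS_Res / d.f._Res) for a list of samples."""
    g = len(groups)
    n_total = sum(len(sample) for sample in groups)
    grand_mean = sum(sum(sample) for sample in groups) / n_total
    ss_tr = sum(
        len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in groups
    )
    ss_res = sum((x - sum(s) / len(s)) ** 2 for s in groups for x in s)
    return (ss_tr / (g - 1)) / (ss_res / (n_total - g))

# Made-up data: three treatments, three samples each, so d.f. = (2, 6).
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
F_CRITICAL = 5.14   # table value for alpha = 0.05 with (2, 6) d.f.

f_value = f_statistic(groups)
reject_h0 = f_value > F_CRITICAL
print(f_value, reject_h0)   # 13.0 True
```

Here the statistic is (26/2)/(6/6) = 13, well above the table value, so H0 is rejected and the treatments are judged different.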

Page 29: Statistical Comparison of Two or More Systems

SPREADSHEET HIGHLIGHTS