
Tuning tie-breaker experiments 1

Tuning the tie-breaker design

Art B. Owen∗, Stanford University

and

Hal Varian, Google

∗Work (mostly) done for Google, not as part of my Stanford responsibilities.

Stanford statistics seminar


Overview

Lots of problems come up when you combine data of different types.

From 40,000 feet

• Bayes

• Likelihood

• Empirical Bayes

• Transportability

At ground level

Specifics are interesting.

First I sketch some recent examples.

Then the work with Hal Varian.


Big data & small data

With Aiyou Chen and Minghui Shi of Google.

Small and good data set

$(x_i, y_i)$, $i \in S$, $n$ obs

We want $\beta$ for this small population.

Huge data set, possibly relevant

$(x_i, y_i)$, $i \in B$, $N \gg n$ obs

Approach

Shrink $\hat\beta_S$ towards $\hat\beta_B$

Stein or Bayes


Related GWAS

With Edgar Dobriban, Stuart Kim, Kristen Fortney

• Tiny underpowered GWAS on centenarians

• Seek optimal weighting of the SNP hypotheses (inverse weight the p-values)

• Using huge GWAS on age-related illness (e.g., diabetes, hypertension)

We got some new longevity-associated genes.


Partial conjunction tests

Concept from Benjamini & Heller

Test the same H0 in n data sets. Require at least r rejections.

Better reproducibility than meta-analysis.

2 papers led by Jingshu Wang

Paper 1

Conditions for admissible testing of a weirdly composite “sparsity null”.

Wang & O (2018) JASA

Paper 2

N genes in n studies

An N × n matrix of p-values

Filtering idea to do N PC tests at once.

Wang, Su, Sabatti, O (2018)


Propensity work

With Evan Rosenman, Michael Baiocchi and Hailey Banack (2018)

Does W ∈ {0, 1} cause y?

Huge data base (Wi, xi, yi) for i ∈ Obs.

Wi chosen in a way that could depend on xi

Small randomized experiment (Wi, xi, yi) for i ∈ Expt.

Wi chosen at random

First idea

Put experimental subjects into a propensity bucket.

The one they would have occupied in the observational data.

Women’s Health Initiative

Both kinds of data on hormone replacement vs coronary heart disease.


Hal Varian

Google chief economist


Customer loyalty plans

An airline can give an upgrade to n out of N customers. Who?

• The n most loyal customers?

• The n customers most likely to start flying / spending more?

Other examples

• Hotels & car rental companies

• E-commerce platforms, for their advertisers, reviewers, or content producers


Two goals

1) Get the most value from the offer

2) Measure the causal effect of the offer


Two acronyms

1) RDD = Regression Discontinuity Design

2) RCT = Randomized Controlled Trial

We will hybridize between these approaches.


The random variables

$i$     customer id

$z_i$   treatment, YES = 1, NO = −1

$y_i$   outcome, e.g., revenue one year later (or profit, or ...)

$x_i$   assignment variable (the larger the better)

Assignment / running variable x

1) It could be past revenue, or

2) a machine learning prediction.


Some simplifications

Suppose at first that half of the $z_i$ are $1$ and half are $-1$.

(undo later)

Rank transformation

Sort customers, $x_1 \le x_2 \le \cdots \le x_N$, then

re-define $x_i \leftarrow \dfrac{2i - N - 1}{N}$

Now $-1 < x_i < 1$.

Two-line regression

\[ y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i + \varepsilon_i, \qquad \varepsilon_i \sim (0, \sigma^2) \]

Other models are interesting, but we need to pick one, so this is it.
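A minimal sketch of the rank transformation and the two-line fit, assuming NumPy is available. The function names, the lognormal "past revenue" and the coefficient values are made up for illustration, and the assignment here is a plain coin flip, so the two arms are only approximately half and half.

```python
import numpy as np

def rank_transform(raw):
    """Map raw scores to x_i = (2i - N - 1)/N, where i is the 1-based rank."""
    raw = np.asarray(raw, dtype=float)
    n = raw.size
    ranks = np.empty(n)
    ranks[np.argsort(raw, kind="stable")] = np.arange(1, n + 1)
    return (2.0 * ranks - n - 1.0) / n

def fit_two_line(x, z, y):
    """Least-squares estimate of (beta0, beta1, beta2, beta3)."""
    X = np.column_stack([np.ones_like(x), x, z, x * z])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
N = 2000
raw = rng.lognormal(size=N)                    # hypothetical past revenue
x = rank_transform(raw)
z = rng.choice([-1.0, 1.0], size=N)            # illustrative coin-flip assignment
beta_true = np.array([10.0, 1.0, 0.5, 0.25])   # made-up coefficients
y = (beta_true[0] + beta_true[1] * x + beta_true[2] * z
     + beta_true[3] * x * z + rng.normal(size=N))
beta_hat = fit_two_line(x, z, y)
print(beta_hat)
```

The transformed x values are symmetric about zero and strictly inside (−1, 1), which is what the later integral approximations rely on.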


Regression discontinuity

Treatment iff x > 0. Thistlethwaite & Campbell (1960)

[Figure: scatter plot of response against the running variable on [−1, 1].]

People just left of the discontinuity should be comparable to those just right of it.


Separate linear regressions

[Figure "Regression discontinuity": response against the running variable on [−1, 1].]

Raises thorny extrapolation/linearity issues at large $|x_i|$.


Regression discontinuity

Famous example:

x = test score

z = merit scholarship iff x > τ

y = went to grad school

then logistic regression.

RDD is the second most believable causal inference method.


Tie-breaker design

Pick cutoffs $A \le B$, then

\[ z_i = \begin{cases} 1, & x_i \ge B \\ -1, & x_i \le A \\ \text{random}, & A < x_i < B \end{cases} \]

[Figure: the running variable axis from −1 to 1, split at A and B into NO (left), 50 : 50 (middle) and YES (right).]

Extreme cases

1) $x_1 < A = B < x_N \implies$ RDD

2) $A < x_1 \le \cdots \le x_N < B \implies$ RCT

Also called “cutoff designs” Cappelleri & Trochim
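A minimal sketch of the assignment rule, using only the standard library. It assumes that ties at the cutoffs go to the deterministic arms (x ≥ B treated, x ≤ A not treated) and that "random" means a fair coin, as the 50 : 50 label suggests. The name `tie_breaker_assign` is hypothetical.

```python
import random

def tie_breaker_assign(x, A, B, rng=None):
    """Return z in {-1, +1} for each x: +1 if x >= B, -1 if x <= A, fair coin otherwise."""
    if A > B:
        raise ValueError("need A <= B")
    rng = rng or random.Random()
    z = []
    for xi in x:
        if xi >= B:
            z.append(1)
        elif xi <= A:
            z.append(-1)
        else:
            z.append(rng.choice((-1, 1)))
    return z

N = 10
x = [(2 * i - N - 1) / N for i in range(1, N + 1)]        # rank-transformed scores
z_rdd = tie_breaker_assign(x, 0.0, 0.0)                    # A = B: pure RDD
z_tb = tie_breaker_assign(x, -0.5, 0.5, random.Random(1))  # randomize the middle
z_rct = tie_breaker_assign(x, -2.0, 2.0, random.Random(1)) # A < x_1, x_N < B: pure RCT
```

With A = B every assignment is deterministic, and with cutoffs outside the data every assignment is a coin flip, matching the two extreme cases.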


Examples

x                      z                              Ref

Reading ability        Remedial English class         Aiken et al. (1998)

Student ranking        Post secondary financial aid   Angrist et al. (2014)

Composite prognostic   Inpatient rehab                Havassy

Lanarkshire milk experiment

Student (1931)

Maybe a tie-breaker would have worked.


Tie-breakers

∆ = Fraction in RDD between Blue dashed lines

[Figure: four panels of outcome against x on [−1, 1], titled "RCT: Delta = 1", "RDD: Delta = 0", "Delta = 1/3" and "Delta = 2/3".]


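Two-line regression

\[ E(y) = \beta_0 + \beta_1 x + \beta_2 z + \beta_3 xz \]

\[
X = \begin{pmatrix}
1 & x_1 & z_1 & x_1 z_1 \\
1 & x_2 & z_2 & x_2 z_2 \\
\vdots & \vdots & \vdots & \vdots \\
1 & x_N & z_N & x_N z_N
\end{pmatrix}
\]

\[ \mathrm{Var}(\hat\beta) = (X^T X)^{-1} \sigma^2 \]

\[ \Pr(z_i = 1) = \begin{cases} 0, & x_i \le -\Delta \\ 1/2, & |x_i| < \Delta \\ 1, & x_i \ge \Delta \end{cases} \]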


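Integral approximation

\[
\frac{1}{N} X^T X \approx
\begin{array}{c|cccc}
   & 1 & x & z & xz \\ \hline
1  & 1 & 0 & 0 & \phi(\Delta) \\
x  & 0 & 1/3 & \phi(\Delta) & 0 \\
z  & 0 & \phi(\Delta) & 1 & 0 \\
xz & \phi(\Delta) & 0 & 0 & 1/3
\end{array}
\]

where

\[
\phi(\Delta) \equiv \frac12 \int_{-1}^{1} x\,E(z \mid x)\,dx
= \frac12 \int_{-1}^{-\Delta} (-x)\,dx + \frac12 \int_{-\Delta}^{\Delta} 0\,dx + \frac12 \int_{\Delta}^{1} x\,dx
= \frac{1 - \Delta^2}{2}
\]

The error above is $O_p(1/\sqrt{N})$.

Even less under stratification.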
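A small standard-library check of the two nontrivial entries, 1/3 and φ(∆), on the rank-transformed grid. It averages over E(z | x) rather than drawing random z, so it only checks the deterministic part of the approximation and not the O_p(1/√N) sampling error. The function names are hypothetical.

```python
def phi(delta):
    """Closed form (1 - delta^2) / 2."""
    return (1.0 - delta ** 2) / 2.0

def expected_z(x, delta):
    """E(z | x) for the symmetric tie-breaker: -1, 0 or +1."""
    if x >= delta:
        return 1.0
    if x <= -delta:
        return -1.0
    return 0.0

def grid_moments(N, delta):
    """Averages of x*E(z|x) and x^2 over the grid x_i = (2i - N - 1)/N."""
    xs = [(2 * i - N - 1) / N for i in range(1, N + 1)]
    mean_xz = sum(x * expected_z(x, delta) for x in xs) / N
    mean_x2 = sum(x * x for x in xs) / N
    return mean_xz, mean_x2

mean_xz, mean_x2 = grid_moments(10_000, 0.4)
print(mean_xz, phi(0.4))   # both close to 0.42
print(mean_x2)             # close to 1/3
```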


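Rearrange $X^T X / N$

\[
\begin{array}{c|cccc}
   & 1 & zx & z & x \\ \hline
1  & 1 & \phi & \cdot & \cdot \\
zx & \phi & 1/3 & \cdot & \cdot \\
z  & \cdot & \cdot & 1 & \phi \\
x  & \cdot & \cdot & \phi & 1/3
\end{array}
\]

(using · for 0)

\[
N \times \mathrm{Var}\begin{pmatrix} \hat\beta_0 \\ \hat\beta_3 \\ \hat\beta_2 \\ \hat\beta_1 \end{pmatrix}
= \frac{1}{1/3 - \phi^2}
\begin{pmatrix}
1/3 & -\phi & \cdot & \cdot \\
-\phi & 1 & \cdot & \cdot \\
\cdot & \cdot & 1/3 & -\phi \\
\cdot & \cdot & -\phi & 1
\end{pmatrix} \sigma^2
\]

\[ \phi = \phi(\Delta) = \frac{1 - \Delta^2}{2} \]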


Normalization

The design choice is which ∆ to use.

That comes down to

\[ \frac{\mathrm{Var}(c^T \hat\beta;\, \Delta_1)}{\mathrm{Var}(c^T \hat\beta;\, \Delta_0)} \]

for various vectors c.

Cancellation

σ2 cancels in this ratio.

So we fix σ2 = 1.


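Impact

Changing $z$ from $-1$ to $+1$ increases $E(y)$ by

\[ (\beta_0 + \beta_1 x + \beta_2 + \beta_3 x) - (\beta_0 + \beta_1 x - \beta_2 - \beta_3 x) = 2(\beta_2 + x\beta_3) \]

So $\beta_2$ and $\beta_3$ are important.

So is $x$.

(If we didn’t already know)

Variance

\[ \mathrm{Var}(2(\hat\beta_2 + x\hat\beta_3)) = \cdots = \frac{16(1 + 3x^2)}{1 + 3\Delta^2(2 - \Delta^2)} \]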
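One way to fill in the elided steps, as a sketch under the integral approximation with σ² = 1. It restores the factor 1/N that the variance line leaves implicit, and uses the fact that β̂2 and β̂3 sit in different blocks of the rearranged matrix, so their covariance is zero in this approximation.

```latex
\begin{aligned}
\mathrm{Var}\bigl(2(\hat\beta_2 + x\hat\beta_3)\bigr)
  &= 4\bigl[\mathrm{Var}(\hat\beta_2) + x^2\,\mathrm{Var}(\hat\beta_3)\bigr] \\
  &= \frac{4}{N}\cdot\frac{1/3 + x^2}{1/3 - \phi^2}
   = \frac{4}{N}\cdot\frac{1 + 3x^2}{1 - 3\phi^2}, \\
1 - 3\phi^2
  &= 1 - \tfrac{3}{4}\,(1 - \Delta^2)^2
   = \frac{1 + 3\Delta^2(2 - \Delta^2)}{4}, \\
N\,\mathrm{Var}\bigl(2(\hat\beta_2 + x\hat\beta_3)\bigr)
  &= \frac{16\,(1 + 3x^2)}{1 + 3\Delta^2(2 - \Delta^2)}.
\end{aligned}
```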


Variance vs ∆

[Figure "Variance vs Delta" (0 = regression discontinuity, 1 = experiment): N × variance against Delta for the coefficient of z and the coefficient of z*x.]

$\mathrm{Var}(\hat\beta_3) = 3\,\mathrm{Var}(\hat\beta_2)$ for all $\Delta$.


RCT vs RDD

Method                     ∆    Var(β̂2)   Var(β̂3)

Regression discontinuity   0    4/N        12/N

Experiment                 1    1/N        3/N

An RDD with N observations is as good as an RCT with N/4 observations.

Section 6 of Jacob, Zhu, Somers & Bloom (2012) has this and more observations.
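The two rows of the table can be reproduced from the 2 × 2 block inverse, whose diagonal entries are (1/3)/(1/3 − φ²) for β̂2 and 1/(1/3 − φ²) for β̂3. A minimal sketch with σ² = 1; the function names are hypothetical.

```python
def phi(delta):
    """phi(Delta) = (1 - Delta^2) / 2."""
    return (1.0 - delta ** 2) / 2.0

def n_var_beta2_beta3(delta):
    """Return (N*Var(beta2_hat), N*Var(beta3_hat)) under the integral approximation."""
    det = 1.0 / 3.0 - phi(delta) ** 2
    return (1.0 / 3.0) / det, 1.0 / det

for name, delta in [("Regression discontinuity", 0.0), ("Experiment", 1.0)]:
    v2, v3 = n_var_beta2_beta3(delta)
    print(f"{name:26s} Delta={delta:.0f}  N*Var(b2)={v2:.2f}  N*Var(b3)={v3:.2f}")
```

For any ∆ the second value is three times the first, so a single curve shape describes both coefficients.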


Var(2(β̂2 + xβ̂3))

[Figure "Variance of treatment effect vs x, linear regression": N × variance against target location x. Top curve = regression discontinuity (Delta = 0), bottom curve = experiment (Delta = 1), step size 0.1.]

The worst RCT (at x = 1) is as good as the best RDD (at x = 0): the variance formula gives N × variance = 16 in both cases.


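Gain from the sample

The expected payoff per customer in the data set is

\[ \frac{1}{N} \sum_{i=1}^{N} \bigl( \beta_0 + \beta_1 x_i + \beta_2 E(z_i) + \beta_3 x_i E(z_i) \bigr) \]

\[ E(z_i) = \begin{cases} -1, & x_i < -\Delta \\ 0, & |x_i| \le \Delta \\ 1, & x_i > \Delta \end{cases} \]

So plan with

\[
\begin{aligned}
g(\Delta) &\equiv \frac12 \int_{-1}^{-\Delta} (\beta_0 + \beta_1 x - \beta_2 - \beta_3 x)\,dx
 + \frac12 \int_{-\Delta}^{\Delta} (\beta_0 + \beta_1 x)\,dx
 + \frac12 \int_{\Delta}^{1} (\beta_0 + \beta_1 x + \beta_2 + \beta_3 x)\,dx \\
&= \beta_0 + \beta_3 (1 - \Delta^2)/2.
\end{aligned}
\]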
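A quick numerical check that the grid average matches the closed form for g(∆), using only the standard library. The coefficient values are made up and the function names are hypothetical.

```python
def gain_closed_form(beta, delta):
    """g(Delta) = beta0 + beta3 * (1 - Delta^2) / 2."""
    b0, b1, b2, b3 = beta
    return b0 + b3 * (1.0 - delta ** 2) / 2.0

def gain_on_grid(beta, delta, N=10_000):
    """Average expected payoff over the grid x_i = (2i - N - 1)/N."""
    b0, b1, b2, b3 = beta
    total = 0.0
    for i in range(1, N + 1):
        x = (2 * i - N - 1) / N
        ez = 1.0 if x > delta else (-1.0 if x < -delta else 0.0)  # E(z_i)
        total += b0 + b1 * x + b2 * ez + b3 * x * ez
    return total / N

beta = (10.0, 1.0, 0.5, 2.0)   # illustrative values only
print(gain_on_grid(beta, 0.3), gain_closed_form(beta, 0.3))
```

The β1 and β2 terms average out by symmetry, which is why only β0 and β3 survive in the closed form.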


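The tradeoff

Short term gain per customer

\[ g(\Delta) = \beta_0 + \frac{\beta_3 (1 - \Delta^2)}{2} \]

Define the information gain per customer

\[ \mathrm{info}(\Delta) \equiv \frac{1}{N\,\mathrm{Var}(\hat\beta_3)} = \frac13 - \frac{(1 - \Delta^2)^2}{4} \]

Balance

\[
v(\Delta) \equiv g(\Delta) + \lambda \times \mathrm{info}(\Delta)
= \beta_0 + \beta_3 \frac{1 - \Delta^2}{2} + \lambda \Bigl( \frac13 - \frac{(1 - \Delta^2)^2}{4} \Bigr)
\]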

NB: β0 does not affect our choice of ∆.


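Optimal ∆

$\beta_3$ is the coefficient of $x_i z_i$

$\lambda$ is the value of information

\[
\Delta^* = \begin{cases}
1, & \beta_3/\lambda \le 0 \\
\sqrt{1 - \beta_3/\lambda}, & 0 \le \beta_3/\lambda \le 1 \\
0, & 1 \le \beta_3/\lambda.
\end{cases}
\]

[Figure: optimal Delta (0 to 1) against present / future value (0 to 1.2).]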
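A minimal sketch of the optimal ∆ rule with a brute-force grid check of v(∆). It assumes λ > 0, which the piecewise formula needs for the ratio β3/λ to be meaningful. The function names and the example values β3 = 0.5, λ = 1 are illustrative only.

```python
import math

def value(delta, beta3, lam, beta0=0.0):
    """v(Delta) = g(Delta) + lam * info(Delta)."""
    u = 1.0 - delta ** 2
    return beta0 + beta3 * u / 2.0 + lam * (1.0 / 3.0 - u * u / 4.0)

def optimal_delta(beta3, lam):
    """Piecewise Delta* as a function of beta3 / lam; assumes lam > 0."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    r = beta3 / lam
    if r <= 0.0:
        return 1.0          # no short-term gain from targeting: run the full RCT
    if r >= 1.0:
        return 0.0          # short-term gain dominates: pure RDD
    return math.sqrt(1.0 - r)

grid = [k / 1000 for k in range(1001)]
best_on_grid = max(grid, key=lambda d: value(d, 0.5, 1.0))
print(optimal_delta(0.5, 1.0), best_on_grid)
```

Writing u = 1 − ∆², v is a concave quadratic in u with stationary point u = β3/λ, which is where the square root comes from.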


Value of future information

It is really hard to quantify the value of that information.

Maybe harder than eliciting a prior.


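Simpler approach

Let $\Delta_0$ be the smallest $\Delta$ with efficiency $\rho$ vs RCT $\Delta = 1$.

We know that $1/4 \le \rho \le 1$.

\[
\rho = \frac{\mathrm{Var}(2(\hat\beta_2 + x\hat\beta_3) \mid \Delta = 1)}{\mathrm{Var}(2(\hat\beta_2 + x\hat\beta_3) \mid \Delta = \Delta_0)}
= \cdots = \frac{1 + 3\Delta_0^2 (2 - \Delta_0^2)}{1 + 3(2 - 1)}
\]

Solve a quadratic equation for $\Delta_0^2$

\[ 3\Delta_0^4 - 6\Delta_0^2 + 4\rho - 1 = 0 \]

\[ \implies \Delta_0 = \sqrt{1 - \sqrt{1 - (4\rho - 1)/3}} \]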
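A minimal sketch of the closed form for ∆0 together with the efficiency it inverts, using only the standard library. The function names are hypothetical.

```python
import math

def efficiency(delta):
    """rho as a function of Delta: (1 + 3 Delta^2 (2 - Delta^2)) / 4."""
    return (1.0 + 3.0 * delta ** 2 * (2.0 - delta ** 2)) / 4.0

def minimal_delta(rho):
    """Smallest Delta whose efficiency relative to the RCT is rho."""
    if not 0.25 <= rho <= 1.0:
        raise ValueError("rho must lie in [1/4, 1]")
    return math.sqrt(1.0 - math.sqrt(1.0 - (4.0 * rho - 1.0) / 3.0))

for rho in (0.99, 0.9, 0.8, 0.7, 0.6):
    print(f"rho={rho:<5} Delta0={minimal_delta(rho):.2f}")
```

The smaller root of the quadratic is taken because the larger one gives ∆0² > 1, outside the design range.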


Minimal ∆ for efficiency ρ

[Figure: minimal Delta (0 to 1) against efficiency demanded (0 to 1).]

ρ      ∆0

0.99   0.94

0.9    0.80

0.8    0.70

0.7    0.61

0.6    0.52


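Gaussian running variable

For $x_i = \Phi^{-1}\Bigl( \dfrac{i - 1/2}{N} \Bigr)$,

experiment on central $\Delta N$ observations, then

\[
\frac{1}{N} X^T X \approx
\begin{array}{c|cccc}
   & 1 & zx & z & x \\ \hline
1  & 1 & \phi_G & 0 & 0 \\
zx & \phi_G & 1 & 0 & 0 \\
z  & 0 & 0 & 1 & \phi_G \\
x  & 0 & 0 & \phi_G & 1
\end{array}
\]

\[ \phi_G = \mathrm{avg}(x_i z(x_i)) = \cdots = 2\varphi\Bigl( \Phi^{-1}\Bigl( \frac{1 + \Delta}{2} \Bigr) \Bigr) \]

After some algebra, the RCT efficiency vs RDD is

\[ \frac{\pi}{\pi - 2} \doteq 2.75 \]

Goldberger (1972).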
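A standard-library sketch of φ_G and the resulting variance ratio, using `statistics.NormalDist` for Φ⁻¹ and ϕ. Each 2 × 2 block of the matrix is [[1, φ_G], [φ_G, 1]], so every coefficient has N × variance 1/(1 − φ_G²) when σ² = 1. The function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def phi_G(delta):
    """2 * pdf(inv_cdf((1 + Delta)/2)) for a standard Gaussian running variable."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    if delta == 1.0:
        return 0.0                      # limit as the cutoff goes to infinity
    cut = _STD_NORMAL.inv_cdf((1.0 + delta) / 2.0)
    return 2.0 * _STD_NORMAL.pdf(cut)

def n_var_coef(delta):
    """N * Var of any one coefficient, sigma^2 = 1."""
    return 1.0 / (1.0 - phi_G(delta) ** 2)

ratio = n_var_coef(0.0) / n_var_coef(1.0)   # RDD variance over RCT variance
print(ratio, math.pi / (math.pi - 2.0))
```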


Carpentry

We don’t have to keep $p(x) \equiv \Pr(Z = 1 \mid x) \in \{0, 1/2, 1\}$.

[Figure: P(Z = 1 | x) (0 to 1) against the running variable x on [−1, 1].]

Carpentry doesn’t really help

(or hurt).


Carpentry ctd

Under symmetry,

\[ p(-x) = 1 - p(x), \]

the shape of the curve doesn’t matter, only

\[ \overline{zx} \equiv \frac12 \int_{-1}^{1} x\,E(Z \mid x)\,dx = \frac12 \int_{-1}^{1} x\,(2p(x) - 1)\,dx > 0 \]

short term gain is

\[ \beta_0 + \beta_3\,\overline{zx} \]

information proportional to

\[ \frac13 - \overline{zx}^{\,2} \]

Asymmetric p

Replace by a symmetric one. That reduces

\[ \mathrm{diag}(\mathrm{Var}(\hat\beta)) \]

keeping gain the same.


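Two quadratics

\[ E(Y) = \beta_0 + \beta_1 x + \beta_2 z + \beta_3 xz + \beta_4 x^2 + \beta_5 x^2 z \]

\[
\frac{1}{N} X^T X \doteq
\begin{array}{c|cccccc}
     & 1 & zx & x^2 & z & x & zx^2 \\ \hline
1    & 1 & \phi_1 & 1/3 & \cdot & \cdot & \cdot \\
zx   & \phi_1 & 1/3 & \phi_3 & \cdot & \cdot & \cdot \\
x^2  & 1/3 & \phi_3 & 1/5 & \cdot & \cdot & \cdot \\
z    & \cdot & \cdot & \cdot & 1 & \phi_1 & 1/3 \\
x    & \cdot & \cdot & \cdot & \phi_1 & 1/3 & \phi_3 \\
zx^2 & \cdot & \cdot & \cdot & 1/3 & \phi_3 & 1/5
\end{array}
\]

\[ \phi_1 = (1 - \Delta^2)/2 \qquad \phi_3 = (1 - \Delta^4)/4 \]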


Treatment effect

\[ E(y \mid x, z = 1) - E(y \mid x, z = -1) = 2(\beta_2 + x\beta_3 + x^2\beta_5) \]

[Figure "Variance of treatment effect vs x, quadratic regression": N × variance against target location x. Top curve = regression discontinuity (Delta = 0), bottom curve = experiment (Delta = 1), step size 0.1.]

Note log scale.

Gelman & Imbens (2017) warn against polynomial RDD.
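A sketch of how the quadratic-model variance can be computed from the two 3 × 3 blocks, assuming NumPy and σ² = 1. Both blocks share the same matrix in φ1 and φ3; β2 and β5 live in the (z, x, zx²) block and β3 in the (1, zx, x²) block, so the cross-block covariance is zero in this approximation. The function names are hypothetical.

```python
import numpy as np

def gram_block(delta):
    """Shared 3x3 block of X^T X / N for the two-quadratics model."""
    phi1 = (1.0 - delta ** 2) / 2.0
    phi3 = (1.0 - delta ** 4) / 4.0
    return np.array([[1.0,       phi1,      1.0 / 3.0],
                     [phi1,      1.0 / 3.0, phi3],
                     [1.0 / 3.0, phi3,      1.0 / 5.0]])

def n_var_quadratic_effect(x, delta):
    """N * Var(2 (b2 + x b3 + x^2 b5)) under the integral approximation."""
    G_inv = np.linalg.inv(gram_block(delta))
    c = np.array([1.0, 0.0, x * x])   # weights on (z, x, zx^2): beta2 and beta5
    d = np.array([0.0, x, 0.0])       # weights on (1, zx, x^2): beta3
    return 4.0 * (c @ G_inv @ c + d @ G_inv @ d)

for delta in (0.0, 1.0):
    print(delta, n_var_quadratic_effect(0.0, delta), n_var_quadratic_effect(1.0, delta))
```

At ∆ = 0 the block is the 3 × 3 Hilbert matrix, which is one way to see why the RDD variance grows so quickly with x.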


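More elaborate models

For a feature vector $F = F(x) \in \mathbb{R}^d$ including intercept

\[ E(y) = F^T \beta + z F^T \gamma \]

take

\[ z_i = \begin{cases} 1, & \theta^T F_i \ge \Delta \\ \text{random}, & |\theta^T F_i| < \Delta \\ -1, & \theta^T F_i \le -\Delta. \end{cases} \]

Now

\[ X^T X = \begin{pmatrix} A & B \\ B & A \end{pmatrix}, \qquad A = \sum_i F_i F_i^T, \qquad B = \sum_i w_i F_i F_i^T, \]

for

\[ w_i = E(z_i \mid F_i) = \begin{cases} 1, & \theta^T F_i \ge \Delta, \\ 2p - 1, & |\theta^T F_i| < \Delta, \\ -1, & \theta^T F_i \le -\Delta. \end{cases} \]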


Inverting block matrices

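\[ \mathrm{Var}(\hat\gamma) = \mathrm{Var}(\hat\beta) = (A - B A^{-1} B)^{-1} \sigma^2 \]

\[ \mathrm{Cov}(\hat\gamma, \hat\beta) = -A^{-1} B (A - B A^{-1} B)^{-1} \sigma^2 \]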

We could pick θ, F and p by brute force search with Monte Carlo as an inner loop.

Here matrix algebra can replace the inner Monte Carlo.

Big ∆ better

For large enough ∆ we get B = 0.

Smaller ∆ raises BA−1B and hence Var(β̂).
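A sketch of the block-inverse shortcut, assuming NumPy and σ² = 1. It uses the expected Gram matrix built from w_i = E(z_i | F_i), with the illustrative choices F(x) = (1, x), θ = (0, 1), p = 1/2 and ∆ = 0, and compares the Schur-complement formula with a direct inverse of the full 2d × 2d matrix. All variable and function names are hypothetical.

```python
import numpy as np

def expected_gram_blocks(F, w):
    """A = sum_i F_i F_i^T and B = sum_i w_i F_i F_i^T."""
    A = F.T @ F
    B = F.T @ (w[:, None] * F)
    return A, B

N = 1000
x = (2.0 * np.arange(1, N + 1) - N - 1.0) / N
F = np.column_stack([np.ones(N), x])          # illustrative features (1, x)
theta = np.array([0.0, 1.0])                  # score is x itself
delta, p = 0.0, 0.5
score = F @ theta
w = np.where(score >= delta, 1.0,
             np.where(score <= -delta, -1.0, 2.0 * p - 1.0))
A, B = expected_gram_blocks(F, w)

A_inv_B = np.linalg.solve(A, B)
schur_inv = np.linalg.inv(A - B @ A_inv_B)    # Var(gamma_hat) = Var(beta_hat)
cov_block = -A_inv_B @ schur_inv              # off-diagonal block of the inverse
full_inv = np.linalg.inv(np.block([[A, B], [B, A]]))
print(N * schur_inv[1, 1])                    # about 12, the RDD value for beta3
```

Only d × d solves are needed per candidate design, which is the sense in which matrix algebra can stand in for an inner Monte Carlo loop.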


Non-central regions

The airline won’t give upgrades to half of their passengers.

They are more likely to do:

\[ z = \begin{cases} 1, & \text{top few} \\ \text{random}, & \text{next few} \\ -1, & \text{majority.} \end{cases} \]

Of the majority, only retain those where the linear model is ok.


Two lines

Experiment in range (A, B):

Method               A       B      Var(β̂3)

Full experiment     −1.00   1.00     3.00/N

RDD                  0.00   0.00    12.00/N

Expt on bottom 50%  −1.00   0.00    13.09/N

Expt on second 10%   0.60   0.80   137.56/N

Top 10% only         0.80   0.80   751.03/N

Top 15% only         0.70   0.70   223.44/N

Top 20% only         0.60   0.60    95.21/N
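The table entries can be reproduced with a short calculation, sketched here under stated assumptions: x uniform on [−1, 1], z = −1 for x ≤ A, z = +1 for x ≥ B, a fair coin in between, σ² = 1, and the same integral approximation as before. In the order (1, x | z, xz) the moment matrix is [[P, Q], [Q, P]] with P = diag(1, 1/3), so the block for (z, xz) inverts through a 2 × 2 Schur complement. The function name is hypothetical.

```python
def n_var_beta3(A, B):
    """N * Var(beta3_hat) for an experiment on the range (A, B)."""
    a = -(A + B) / 2.0                  # E[z]
    b = (2.0 - A ** 2 - B ** 2) / 4.0   # E[x z]
    c = -(A ** 3 + B ** 3) / 6.0        # E[x^2 z]
    # Schur complement S = P - Q P^{-1} Q with Q = [[a, b], [b, c]].
    s00 = 1.0 - (a * a + 3.0 * b * b)
    s01 = -(a * b + 3.0 * b * c)
    s11 = 1.0 / 3.0 - (b * b + 3.0 * c * c)
    det = s00 * s11 - s01 * s01
    return s00 / det                    # (xz, xz) entry of S^{-1}

for label, A, B in [("Full experiment", -1.0, 1.0), ("RDD", 0.0, 0.0),
                    ("Expt on bottom 50%", -1.0, 0.0), ("Top 20% only", 0.6, 0.6)]:
    print(f"{label:20s} {n_var_beta3(A, B):8.2f}/N")
```

The determinant shrinks rapidly as the cutoffs move toward the top of the range, which is why the last rows of the table are so much larger.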


Followup directions

• This x can be the output of a prediction algorithm based on many variables.

So how does the sampling plan help fit the next model?

I.e., how to handle concomitants?

• What about binary responses y?

Logistic regression efficiency actually depends on the underlying β.

Usual approaches are Bayesian.


Thanks

• Hal Varian, co-author

• Google, environment
