Social Experiments in the Labor Market

Jesse Rothstein*

University of California, Berkeley and NBER

Till von Wachter

University of California, Los Angeles and NBER

Chapter prepared for the Handbook of Field Experiments.

July 2016

Abstract

Large-scale social experiments were pioneered in labor economics, and are the basis for much of what we know about topics ranging from the effect of job training to incentives for job search to labor supply responses to taxation. Random assignment has provided a powerful solution to selection problems that bedevil non-experimental research. Nevertheless, many important questions about these topics require going beyond random assignment. This applies to questions pertaining to both internal and external validity, and includes effects on endogenously observed outcomes, such as wages and hours; spillover effects; site effects; heterogeneity in treatment effects; multiple and hidden treatments; and the mechanisms producing treatment effects. In this Chapter, we review the value and limitations of randomized social experiments in the labor market, with an emphasis on these design issues and approaches to addressing them. These approaches expand the range of questions that can be answered using experiments by combining experimental variation with econometric or theoretical assumptions. We also discuss efforts to build the means of answering these types of questions into the ex ante design of experiments. Our discussion yields an overview of the expanding toolkit available to experimental researchers.

* Contact: [email protected], tvwachter@econ.ucla.edu. We thank Ben Smith and Audrey Tiew for sterling research assistance, and Angus Deaton, Larry Katz, Jeff Smith, and conference participants for helpful comments.

I. Introduction

II. What are Social Experiments? Historical and Econometric Background
    a. A Primer on the History and Topics of Social Experiments in the Labor Market
    b. Social experiments as a tool for program evaluation
        i. The benchmark case: Experiments with perfect compliance
        ii. Imperfect compliance and the local average treatment effect
    c. Limitations of the experimental paradigm
        i. Spillover Effects and the Stable Unit Treatment Value Assumption
        ii. Endogenously observed outcomes
        iii. Site and Group Effects
        iv. Treatment Effect Heterogeneity and External Validity
        v. Hidden Treatments
        vi. Mechanisms and Multiple Treatments
    d. Quasi-experimental and Structural Research Designs

III. A more thorough overview of labor market social experiments
    a. Labor Supply Experiments
    b. Training experiments
    c. Job Search Assistance
    d. Practical Aspects of Implementing Social Experiments

IV. Going Beyond Treatment-Control Comparisons to Resolve Additional Design Issues
    a. Spillover effects and SUTVA
        i. Addressing the issue ex post
        ii. Addressing the issue ex ante through the design of the experiment
    b. Endogenously observed outcomes
        i. Addressing the issue ex post
            Parametric selection corrections
            Non- and semi-parametric selection corrections
        ii. Addressing the issue ex ante through the design of the experiment
    c. Site and group effects
        i. Addressing the issue ex post
        ii. Addressing the issue ex ante through the design of the experiment
    d. Treatment effect heterogeneity and external validity
        i. Addressing the issue ex post
        ii. Addressing the issue ex ante through the design of the experiment
    e. Hidden treatments
        i. Addressing the issue ex post
        ii. Addressing the issue ex ante through the design of the experiment
    f. Mechanisms and multiple treatments
        i. Addressing the issue ex post
        ii. Addressing the issue ex ante through the design of the experiment

V. Conclusion

I. Introduction

There is a very long history of social experimentation in labor markets. Experiments have addressed core labor market topics such as labor supply, job search, and human capital accumulation, and have been central to the academic literature and policy discussion, particularly in the United States, for many decades. By many accounts, the first large-scale social experiment was the New Jersey Income Maintenance Experiment, initiated in 1968 by the U.S. Office of Economic Opportunity to test the effect of income transfers and income tax rates on labor supply. Where many subsequent experiments have been designed to evaluate a single program or treatment each, the Income Maintenance Experiment was intended instead to map out a response surface. Participants were assigned to a control group or to one of eight treatment arms that varied in the income guarantee to a family that did not work and the rate at which this was taxed away as earnings rose. Three follow-up experiments – in rural North Carolina and Iowa; in Gary, Indiana; and in Seattle and Denver – with varying benefit levels and tax rates (and, in Seattle and Denver, a cross-cutting set of counseling and training treatments) were begun before data collection for the New Jersey experiment was complete.

Other early labor market experiments examined the effects of job search encouragement for Unemployment Insurance recipients; job training and job search programs; subsidized jobs for the hard-to-employ; and programs designed to push welfare recipients into work (Greenberg and Robins, 1986; Gueron, this volume). These topics have been returned to repeatedly in the years since as researchers have sought to test new program designs or to address the limitations of earlier research. There have also been many smaller-scale experiments, on bonus pay schemes, management structure, and other firm-level policies.[1]

From the beginning, the use of random assignment experiments (also known as randomized controlled trials, or RCTs) has been controversial in labor economics.[2] The primary, powerful appeal of RCTs is that they solve the assignment, or selection, problem in program evaluation. In non-experimental studies (also known as “observational” studies), program participants may differ in observed and unobserved ways from those who do not participate, and econometric adjustments for this selection rely on unverifiable, often implausible assumptions (Lalonde 1986; Fraker and Maynard 1987; though see also Heckman and Hotz, 1989). With a well-executed randomization study, however, the treatment and control groups are comparable by design, making it straightforward to identify the effect of the treatment under study.

But set against this very important advantage are a number of drawbacks to experimentation. Early on, it was recognized that RCTs can be very expensive and hard to implement successfully. For example, it is not always possible to ensure that everyone assigned to receive a treatment receives a full dose, while those assigned to the control group receive none, though this is the experimental ideal. Sometimes it is not feasible to control participants’ behavior, and many participants deviate from their intended treatment assignments. In other cases, ethical, political, or operational considerations make it undesirable to limit access to alternative treatments. Although this can be partly addressed within the basic experimental paradigm, it does limit what can be learned.

[1] We omit here audit studies aimed at uncovering discrimination in the labor market and elsewhere (e.g., Bertrand and Mullainathan 2004; Kroft, Lange, and Notowidigdo 2013; Farber, Silverman, and von Wachter 2015). These are covered by Bertrand and Duflo, elsewhere in this volume.

[2] For recent criticisms of reliance on RCTs with particular relevance to labor market studies, see Deaton (2010) and Heckman (2010). See also Heckman and Smith (1995).

More generally, while random assignment solves the assignment problem, it alone is not sufficient to resolve other problems that researchers often face. Many questions of interest can be answered only with something more than the familiar two-armed randomized control trial – a more complex experimental design, the augmentation of experimental data with additional, non-experimental data, theoretically grounded assumptions, or a combination of these. We consider a number of such questions in this chapter. These include:

• Questions about impacts on endogenously observed outcomes. Consider the effect of job training on wages. Because wages are observed only for those who have jobs, and because training may affect the likelihood of working, the contrast in mean wages between randomly assigned treatment and control groups does not compare like to like and thus does not solve the assignment problem for this outcome.

• Questions about spillovers and market-level impacts. When one individual’s outcome depends on others’ treatment assignments, experimental estimates of treatment effects can be misleading about a program’s overall effect. In the context of labor market programs, an increase in job search effort by a treatment group may lower the control group’s job-finding chances, leading to an overstatement of the program’s total effect (which will itself depend importantly on the scale at which the program is implemented). Similar issues can arise if subjects communicate with each other, leading to a dilution in treatment contrasts when access to information is part of the treatment.

• Questions about heterogeneity of treatment effects. Experiments have limited ability to identify heterogeneity of treatment effects, especially if heterogeneity is not fully characterized by well-defined observable characteristics. This is often of first-order importance, as in many cases the relevant question is not whether to offer a program (e.g., job training) but for whom to make it available, or which versions of the program are most effective (and why).

• Questions about generalizability. While in ideal cases experiments have high internal validity for the effect of the specific program under study on the specific experimental population, in the setting in which it is studied, they may have limited external validity for generalizations to other locations, to other programs (or even to other implementations of the same program), or to other populations. For example, a reemployment bonus program may have a very different effect in a full-employment local economy than when the local area is in a recession, or the same program offered in different sites may have dramatically different effects due to variation in local program administration or context.

• Questions about mechanisms. Many questions of interest in labor market research do not reduce to the effects of specific “treatments” on observed outcomes, but relate, at best, to the mechanisms by which those effects arise. For example, an important question for the analysis of unemployment insurance programs is whether the unemployed are liquidity constrained or whether they can borrow or save to smooth consumption optimally across periods of employment and unemployment. And important questions about the design of welfare and disability policy turn on whether observed non-employment is due to high disutility of work or to moral hazard. In each case, we want to distinguish income and substitution effects, a distinction that is in general not identified from the simple effect of a treatment on an observed outcome. Carefully designed experiments can shed light on the phenomena of interest, but may not be able to answer these questions directly.

To be clear, all of these questions are thorny under any methodological approach, and are generally no easier to answer in quasi-experimental studies than in randomized experiments. One vocal group of critics of experimentation points to the importance of identifying the “structural” parameters – a full characterization of program enrollment decisions and the behavioral processes that lead to the observed outcomes – that determine program selection and impacts (see, e.g., Keane 2010). In principle, many of the design issues above could indeed be avoided or addressed with estimates of the underlying structural parameters. But these structural parameters are difficult to measure. So-called structural methods generally trade off internal validity in pursuit of more external validity, but a study that fails to solve the assignment problem is unlikely to be any more generalizable than it is internally valid.

Unfortunately, while experiments can sometimes be designed to identify a few key structural parameters, or at least important combinations of them, it is rarely possible to design an experiment that directly identifies all of the structural parameters of interest. Thus, there can be value in combining the two paradigms. This involves imposing untestable assumptions about the processes of interest, while still resting on experimentation (or other empirical methods that offer high internal validity) where possible. The additional assumptions can dramatically enhance external validity if they are correct, though if they are incorrect – and this is generally untestable – both internal and external validity suffer.

The current frontier for labor market research – as in other fields – thus involves combining the best features of the two approaches to permit answers to more questions than are addressed by simple experiments, while retaining at least some of the credibility that these experiments can provide.

In this chapter, we discuss a variety of questions common in labor market research that require this sort of approach. We distinguish two broad strategies for answering these questions using experimental data. First, one can augment traditional randomized experiments by imposing additional structure, either economic or econometric, after the fact. In many cases, the amount of structure required, and the strength of the additional assumptions that are necessary, is small relative to the value of the results that can be obtained. Our review gives a snapshot of an expanding toolkit with which researchers can address a wider range of questions based on variation from RCTs.[3]

The second broad strategy is to address the limitations of traditional experiments ex ante, via design of the experimental intervention or evaluation itself. In many cases, clever design choices – multiple treatment arms, carefully designed stratification, or randomization both across and within groups, for example – can allow for richer conclusions than would be possible via traditional experiments. This sort of approach has a long history – indeed, the very first large-scale social experiments, the income maintenance experiments of the late 1960s and early 1970s, can be seen as a version of it. But the pendulum swung away for a long time, and researchers have only recently begun to return to experimental designs that synthesize random experimental variation with more structural modeling. Recent examples of this approach include Kling, Liebman, and Katz (2007), who use it to address potential biases from endogenous attrition, and Crepon, Duflo, Gurgand, Rathelot, and Zamora (2013), who quantify the importance of spillovers. In our view, approaches like these represent the current research frontier.

The rest of this chapter proceeds as follows. In Section II, we give brief overviews of the history of social experiments in the labor market and of the value of RCTs for solving selection problems, and summarize potential design issues that remain even with random assignment. In Section III, we review the types of programs and questions that have been analyzed, their main findings, and practical challenges that labor market experiments often confront. Section IV discusses approaches to addressing the design challenges from Section II and thereby expanding the range of questions that can be answered. We discuss both ex ante and ex post approaches to resolving (or at least ameliorating) the issues. Section V offers some concluding comments.

[3] This includes analyses of issues such as endogenously observed outcomes (e.g., Ahn and Powell 1993, Grogger 2005, Lee 2009); hidden treatments (e.g., Kline and Walters 2014, Feller, Grindal, Miratrix, and Page 2014, Pinto 2015); heterogeneous treatment effects (e.g., Kline and Walters 2014, Heckman and Vytlacil 2005); and multiple treatments and mechanisms (e.g., Card and Hyslop 2005, Schmieder, von Wachter, and Bender 2016, DellaVigna, Lindner, Reizer and Schmieder 2016).

II. What are Social Experiments? Historical and Econometric Background

a. A Primer on the History and Topics of Social Experiments in the Labor Market

As the so-called “credibility revolution” has swept over empirical economics in the last generation, the role and status of experimental evidence have grown. Over the same period, the field of experimental economics has segmented – List and Rasul (2011) and Harrison and List (2004), for example, draw careful distinctions between social experiments and artefactual, natural, and framed field experiments. Briefly, social experiments tend to be conducted at a large scale and to focus on the overall evaluation of policies or programs, often already in place. By contrast, the various types of field experiments are typically smaller in scale and are more likely to use artificial treatments (e.g., behavioral games) that would not correspond directly to any specific policy but are designed primarily to uncover particular behavioral tendencies or parameters.

Although all of the many varieties of experiments have been used to study topics related to the labor market, this chapter focuses on large-scale social experiments, which in our view have had the largest impact on policy.

The social experiment / field experiment distinction corresponds roughly to the distinction drawn above between program evaluation and the identification of structural parameters – social experiments are, at root, evaluations of programs or policies, where field experiments are designed primarily to uncover one or more specific structural parameters.[4] As we discussed above, this distinction is less clear than it once was – scholars are increasingly drawing on program evaluation samples to understand structural relationships, and using structural parameters to inform the design and interpretation of program evaluations. But while the distinction has been blurred, it has not been obliterated, and nearly all of the social experiments that we discuss in this chapter are designed, at least in part, to evaluate programs that either have been or might plausibly be implemented in roughly the form used in the experiment.

Another, related distinction has to do with the communities that conduct the different types of experiments. Social experiments are typically conducted at a large scale by an organization that specializes in this – historically, the “Big Three” players (Greenberg and Shroder 2004) have been Mathematica, the Manpower Demonstration Research Corporation (MDRC), and Abt Associates – and has been hired by a government agency (most notably OPDR, the Office for Policy Development and Research within the Department of Labor’s Employment and Training Administration, and ASPE, the Assistant Secretary for Planning and Evaluation within the Department of Health and Human Services) or a large foundation (e.g., the Ford Foundation) for a specific study. By contrast, field experiments are more often overseen by individual scholars and their students, perhaps with the cooperation of a company or government agency that is not otherwise closely involved in the design.

[4] Kling et al. (forthcoming) refer to experiments aimed at understanding mechanisms rather than at evaluating programs as “mechanism experiments.” Gueron (this volume) discusses the tension between program evaluation and understanding mechanisms in early social experiments.

The differences in the composition and organizational structure of social experimental and field experimental research teams relate to the scope of the work being carried out. A research team implementing a social experiment faces a number of practical and implementation challenges that are largely absent from laboratory experiments and closely related types of field experiments. Researchers rarely have access to a sampling frame corresponding to the population of interest; face practical, ethical, and political difficulties in randomly assigning access to treatment; have limited or no control over treatment alternatives that control participants may obtain or over the specific implementation of the treatment, which is often under the control of an agency rather than the experimenter; and lack ready access to outcome measures for use in assessing the program’s impact (or even to a well-defined set of outcomes of interest). Addressing these challenges often requires a large staff to collect pre- and post-treatment data, to minimize attrition between survey waves, and to monitor both the randomization of treatment and the fidelity of treatment delivery to the program model. The required scale is often out of the reach of individual researchers.

Most authors agree that the first large-scale social experiment in the labor market was the New Jersey Income Maintenance Experiment (hereafter, IME; this is also known as the New Jersey Negative Income Tax experiment), initiated in 1968 and extended in various ways in other locations over the next several years. Consistent with the above dichotomy, this was a large-scale experiment that was initiated by the Office of Economic Opportunity (OEO), then an independent agency within the Federal government that played a lead role in the War on Poverty. But in other ways it more closely resembles what would now be called a field experiment, albeit at a massive scale: It was first conceptualized by an individual researcher, Heather Ross, who proposed it to OEO, and it was designed not to evaluate a specific, well-developed program but to map out the surface of labor supply responses to a range of tax parameters and thereby to uncover semi-structural economic parameters, the income and substitution effects of changes in tax rates.

Nearly all analyses of IME data went beyond simple treatment-control contrasts, using the data to estimate parametric or semi-parametric labor supply models.[5] These models often incorporated corrections for the selection introduced by nonparticipation that relied on strong functional form assumptions (e.g., Tobits) and in some cases also rested on structural specifications of the response to nonlinear tax schedules. In many of these studies, the treatment and control groups were effectively pooled, and it can be difficult to determine the extent to which the parameters are identified from experimental vs. non-experimental variation.

Another sense in which the IME diverged from much modern social experimental practice was in the source of outcome measures. The main outcome measures for the IME analyses were payments under the IME and labor supply measures drawn from participants’ self-reports as part of the program’s administration. But as in other experiments, many subjects failed to complete the follow-up surveys. Unfortunately, the design of the IME program meant that the private returns to continued reporting varied dramatically with both treatment status and endogenous outcomes, as the income maintenance payments were made on the basis of these reports. Differential attrition made the results quite difficult to interpret (Ashenfelter and Plant 1990).

[5] Indeed, in 1990 – seven years after the final experimental report from the follow-up Seattle-Denver Income Maintenance Experiment, and after many published analyses of the data – Ashenfelter and Plant (1990) are apparently the first to report the results as simple means by randomly assigned treatment group.

In the wake of the Income Maintenance Experiments, the field exploded. Greenberg, Shroder, and Onstott (1999; see also Greenberg and Shroder 2004) identified 21 social experiments between 1962 and 1974, largely in education and health. By contrast, they identified 52 between 1975 and 1982 and 70 between 1983 and 1996, and most of these are directly related to the labor market. (There has not been as systematic a census of post-1996 experiments, but the pace of large-scale labor market experiments seems to have dropped off since then, at least in the United States. There has been rapid growth of social experiments in education over this period, however.) Greenberg et al. (1999; hereafter GSO) highlight important changes in the post-1975 experiments. In contrast to the IME, most involved only one or two treatment arms plus a control, and were designed more as “black box” evaluations of the programs encapsulated in the treatments – often modifications of existing programs (Gueron, this volume) – than as efforts to map out a response surface.

GSO emphasize that the vast majority of the experiments they identified focused on low-income populations, a fact that does not seem to have changed since their survey. Several topics stand out as central:

- Human capital development. Over one-third of the studies in GSO’s sample include at least one treatment arm involving a supported work experience, on-the-job training, vocational education or training, or basic education (including GED programs).

- Labor supply. A number of experiments have involved interventions aimed at increasing labor supply, including the income maintenance experiments, studies of re-employment bonuses for unemployment insurance recipients, and a broad group of welfare-to-work experiments conducted as part of the mid-1990s welfare reform movement.

- Job search assistance. Another common category of experiments examines interventions aimed at making disadvantaged workers’ job search efforts more effective, through counseling, job clubs, or job placement services.

These are not mutually exclusive. In particular, a number of programs and experiments combined job search assistance with either job training or incentives to find work.

b. Social experiments as a tool for program evaluation

Random assignment solves the selection problem that often plagues non-experimental program evaluations, and makes it possible to generate uniquely credible evidence on the effects of well-defined, successfully implemented programs. In the absence of random assignment, people who participate in a program (those who are “treated”) are likely to differ in observed and unobserved ways from those who do not participate, and the effect of this selection can be distinguished from the causal effect of the program only via the imposition of unverifiable assumptions about the selection process. This is a very important advantage of the experimental paradigm over other research methodologies (so-called “observational” comparisons), and we do not intend to minimize its contributions to the fields of economics, public policy, and beyond.

But experiments have limitations as well – while they can have very high internal validity, on closer inspection this is true only for certain types of programs and certain types of outcomes; and even then there can be other challenges, such as difficulties in generalizing from the experimental results to a broader setting.

In this subsection, we discuss the value of experiments as a means of solving the selection problem. We then discuss some of the limitations of the experimental paradigm for program evaluation and policy analysis. Our discussion draws heavily on the Angrist-Imbens-Rubin (1996) “potential outcomes” framework. Some of the limitations we discuss can be addressed via careful design of the experimental study, while others require augmenting experimental methods with other tools. We take up these topics in Section IV.

i. The benchmark case: Experiments with perfect compliance

The appeal of randomized experiments is that they make transparent the assumptions that permit causal inference and create a direct link between the implementation of the experiment and the key selection assumption. The simple contrast between those randomly assigned to participate in the program and those randomly excluded identifies the effect of being assigned to participate, subject only to the assumption that the randomization was conducted correctly. Moreover, in many cases this effect is identical to the effect of the program on its participants (known as the “effect of the treatment on the treated,” or TOT), which is often the main parameter of interest; in other cases, it is straightforward to convert the effect of assignment to participate (often known as the “intention to treat,” or ITT, effect) into an estimate of the program treatment effect for a subpopulation of interest.

These results are well known (see, e.g., Athey and Imbens, this volume), and we do not review them at length here. But it will be useful to have notation later. We use Donald Rubin’s potential outcomes framework for causal inference as set forth in Holland (1986). We consider the evaluation of a simple, well-defined program, such as an in-class job training course or a bonus scheme to encourage rapid return to work after a job displacement, where it is possible to assign individuals separately to participate or to be excluded from participation in the program.[6] For each individual i, one can imagine two possible outcomes: One that would obtain if i participated in the program, $y_{i1}$, and one that would obtain if he or she did not participate, $y_{i0}$.[7] The program’s causal effect on person i is simply the difference between the outcome which would obtain if he/she participated and that which would obtain if he/she did not, $\tau_i = y_{i1} - y_{i0}$. When $\tau_i > 0$, i would have a higher outcome if he/she participated than if he/she did not; when $\tau_i < 0$, the opposite is true.

[6] In the case of the bonus scheme, the “treatment” is eligibility for the bonus, not actual receipt.

[7] This notation rests on an assumption about the mechanisms by which the program operates, known as the “stable unit treatment value assumption,” or “SUTVA.” We discuss SUTVA at greater length below.

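Let $D_i$ be an indicator for participation, with $D_i = 1$ if i actually participates in the program and $D_i = 0$ if i does not. The simplest estimator of the program’s effect is the contrast between the average outcomes of those who participate and those who do not. This can be written as:

$$
\begin{aligned}
E[y_i \mid D_i = 1] - E[y_i \mid D_i = 0] &= E[y_{i1} \mid D_i = 1] - E[y_{i0} \mid D_i = 0] \\
&= E[\tau_i \mid D_i = 1] + \left( E[y_{i0} \mid D_i = 1] - E[y_{i0} \mid D_i = 0] \right).
\end{aligned}
$$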

Thus, the simple participant-nonparticipant contrast combines two distinct components: The effect of the treatment on the treated, $\tau_{TOT} = E[\tau_i \mid D_i = 1]$, and a selection term, $E[y_{i0} \mid D_i = 1] - E[y_{i0} \mid D_i = 0]$, that captures the difference in outcomes that would have been observed between those who participated in the program and those who did not, had neither group participated (for example, had the program not existed). This second term arises because the process by which people select (or are selected) into program participation may generate differences between participants and non-participants other than their participation statuses. If so, the participant-nonparticipant difference cannot be interpreted as an estimate of the effect of the program.
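The decomposition can be illustrated with a small simulation. The sketch below is not drawn from any study discussed in this chapter: the data-generating process (an unobserved "ability" trait that raises both untreated outcomes and the probability of enrolling), the parameter values, and the function names are all hypothetical assumptions chosen for illustration. Because the simulation records both potential outcomes for everyone, the TOT and the selection term can be computed directly and compared with the naive contrast.

```python
import random
from statistics import mean


def simulate_self_selection(n=10_000, seed=0):
    """Draw hypothetical potential outcomes with non-random participation."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        ability = rng.gauss(0, 1)                 # unobserved trait (assumed)
        y0 = 10 + 2 * ability + rng.gauss(0, 1)   # outcome without the program
        tau = 1 + 0.5 * rng.gauss(0, 1)           # individual causal effect
        y1 = y0 + tau                             # outcome with the program
        # Higher-ability people are more likely to enroll: this is selection.
        d = 1 if ability + rng.gauss(0, 1) > 0 else 0
        people.append((y0, y1, d))
    return people


def decompose(people):
    """Return (naive contrast, TOT, selection term) for a simulated sample."""
    y1_treated = [y1 for y0, y1, d in people if d == 1]
    y0_treated = [y0 for y0, y1, d in people if d == 1]
    y0_untreated = [y0 for y0, y1, d in people if d == 0]
    naive = mean(y1_treated) - mean(y0_untreated)      # observable contrast
    tot = mean(y1_treated) - mean(y0_treated)          # E[tau | D = 1]
    selection = mean(y0_treated) - mean(y0_untreated)  # E[y0|D=1] - E[y0|D=0]
    return naive, tot, selection


naive, tot, selection = decompose(simulate_self_selection())
print(f"naive contrast: {naive:.3f}")
print(f"TOT: {tot:.3f}  selection term: {selection:.3f}")
```

In this hypothetical setup the naive contrast exceeds the TOT because enrollees would have had higher outcomes even without the program. With real data, $y_{i0}$ is never observed for participants, so neither component can be computed separately.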

In a simple social experiment, $D_i$ is randomly assigned. This ensures that the distributions of $y_{i0}$ and $\tau_i$ are each the same for those with $D_i = 0$ as for those with $D_i = 1$. The first implies that the selection term is zero; the second, that the TOT effect equals the average treatment effect (ATE), $E[\tau_i]$, in the population represented by the study sample. Thus, the average causal effect is identified, not just in the treated subgroup but in the larger population.[8]

This, in a nutshell, is the value of randomization in program evaluation. In a simple randomized control trial, the identification assumption that justifies causal inference is simply that the randomization was correctly executed. Of course, in any finite sample there may be differences in the sample averages of $y_{i0}$ or $\tau_i$ between treatment and control groups. But this variation is captured by the standard error of the experimental estimate. The estimate is unbiased, with measurable uncertainty, so long as the groups are the same in expectation.
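A simulated experiment shows what this means in practice. The sketch below assumes a hypothetical data-generating process with a true ATE of 1, assigns treatment by a coin flip that is independent of everything else, and reports the difference in means together with the usual two-sample standard error. All names and parameter values are illustrative assumptions, not taken from any experiment reviewed here.

```python
import random
from math import sqrt
from statistics import mean, variance


def run_experiment(n=20_000, seed=1):
    """Simulate a two-armed RCT with perfect compliance (hypothetical data)."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)                # unobserved trait (assumed)
        y0 = 10 + 2 * ability + rng.gauss(0, 1)  # outcome without the program
        y1 = y0 + 1 + 0.5 * rng.gauss(0, 1)      # true ATE is 1 by construction
        # A coin flip, independent of ability, decides participation,
        # so only one potential outcome is observed per person.
        if rng.random() < 0.5:
            treated.append(y1)
        else:
            control.append(y0)
    estimate = mean(treated) - mean(control)
    # Standard error of a difference in means between independent groups.
    std_error = sqrt(variance(treated) / len(treated)
                     + variance(control) / len(control))
    return estimate, std_error


estimate, std_error = run_experiment()
print(f"estimated ATE: {estimate:.3f} (standard error {std_error:.3f})")
```

Rerunning with different seeds gives estimates that scatter around the true effect, with a spread roughly matching the reported standard error. That scatter is the finite-sample imbalance described above.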

ii. Imperfect compliance and the local average treatment effect

A complication that often arises, and that will be central to some of our discussion below, is that it is not always possible to control subjects’ program participation. Some subjects who are assigned to receive job training may not show up to their course, while others who are assigned to the control group, and thus not to receive training, may find another way into the program. This can be formalized by introducing an additional variable, $Z_i$, representing the experimenter’s intention for individual i: An individual with $Z_i = 1$ is intended to be served, and one with $Z_i = 0$ is not to be. $Z_i$ is related to $D_i$, but imperfectly: Some (non-randomly selected) individuals who are assigned $Z_i = 1$ will wind up with $D_i = 0$ (e.g., those who fail to arrive for their assigned training course), and others who are assigned $Z_i = 0$ will wind up with $D_i = 1$ (e.g., those who talk their way past the program screener).

[8] This holds if the entire population of interest is part of the experiment. If the study sample is not representative of the broader population, the ATE identified will be local to the subpopulation represented by the sample.

With partial compliance, the experiment identifies neither the average treatment effect (ATE) nor the average effect of the treatment on the treated (TOT). Rather, the best that can be identified is the local average treatment effect, or LATE, for the subgroup of experimental subjects who comply with their experimental assignment. Specifically, let $D_{i0}$ represent the individual’s treatment status if assigned $Z_i = 0$ and $D_{i1}$ represent the treatment status if assigned $Z_i = 1$. The “complier” subpopulation is defined as those with $D_{i0} = 0$ and $D_{i1} = 1$ – those who receive the treatment if and only if they are assigned to receive it. The contrast between the average outcomes of those assigned to receive and not to receive treatment is then:

$$E[y_i \mid Z_i = 1] - E[y_i \mid Z_i = 0] = \Pr\{D_{i0} = 0, D_{i1} = 1\} \cdot E[\tau_i \mid D_{i0} = 0, D_{i1} = 1].$$⁹

This is known as the “intention to treat” (ITT) effect. The first term is the complier share of the experimental population; the second is the local average treatment effect (LATE) for compliers.

In many cases, the ITT is the effect of primary interest. It represents the actual effect of offering access to the program in the setting in which the experiment takes place. Often, it is only possible to manipulate the option to participate (consider, for example, the offer of job training – one can never force individuals to participate in a training program), so the effect of manipulating this offer is the key parameter for evaluation of the programs under consideration.

⁹ We assume here, as in nearly all analyses of experiments with partial compliance, that there are no “defiers” who receive the treatment if and only if they are assigned not to receive it ($D_{i0} = 1$ and $D_{i1} = 0$).

In other cases, however, one might want to identify the effect of program participation (as distinct from the offer to participate). One can recover the LATE for compliers by dividing the ITT by the complier share, which can be identified as $E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]$; equivalently, the LATE can be recovered from an instrumental variables regression using $Z_i$ as an instrument for $D_i$.
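The ratio described above can be sketched in a few lines. The simulation is an illustration under assumed values only: the shares of compliers, never-takers, and always-takers (0.6, 0.2, 0.2), their treatment effects, and the function names are hypothetical, and defiers are ruled out by construction.

```python
import random
import statistics


def simulate_partial_compliance(n, seed=0):
    """Simulate (Z_i, D_i, y_i) triples; all shares and effects are assumed values."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0   # randomized assignment Z_i
        u = rng.random()
        if u < 0.6:
            d0, d1, tau = 0, 1, 2.0          # complier
        elif u < 0.8:
            d0, d1, tau = 0, 0, 0.5          # never-taker
        else:
            d0, d1, tau = 1, 1, 4.0          # always-taker
        d = d1 if z == 1 else d0             # realized participation D_i
        y = rng.gauss(10.0, 1.0) + tau * d   # observed outcome y_i
        rows.append((z, d, y))
    return rows


def wald_estimate(rows):
    """Return (ITT, complier share, LATE) computed from (z, d, y) rows."""
    y1 = statistics.fmean([y for z, _, y in rows if z == 1])
    y0 = statistics.fmean([y for z, _, y in rows if z == 0])
    d1 = statistics.fmean([d for z, d, _ in rows if z == 1])
    d0 = statistics.fmean([d for z, d, _ in rows if z == 0])
    itt = y1 - y0               # E[y | Z = 1] - E[y | Z = 0]
    complier_share = d1 - d0    # E[D | Z = 1] - E[D | Z = 0]
    return itt, complier_share, itt / complier_share


if __name__ == "__main__":
    itt, share, late = wald_estimate(simulate_partial_compliance(200_000))
    print(f"ITT={itt:.2f}, complier share={share:.2f}, LATE={late:.2f}")
```

With these assumed parameters the population ATE is 2.1, while the ratio recovers the complier effect of 2.0; the effects for never-takers and always-takers play no role in the estimate.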

The LATE may differ from the ATE or even from the TOT. For example, in many settings one would expect people who will receive the largest benefits from treatment to make disproportionate efforts to obtain it, even if assigned to the control group; in this case, the TOT will exceed the LATE. Unfortunately, the compliers are not always the population of primary interest. Further structure, or successful randomization of $D_i$ itself, is required to identify the ATE or TOT.

c. Limitations of the experimental paradigm

The basic experimental paradigm is invaluable for its ability to resolve the fundamental problem of causal inference, by ensuring that estimated program effects are not confounded by selection into treatment. But it cannot solve all identification problems faced by program evaluators, nor answer all questions posed by labor economists seeking to understand the workings of the labor market. In the remainder of this section, we will briefly introduce six (partially overlapping) design issues that commonly arise in labor market experiments. In each case, identifying the effects of interest may require moving beyond the treatment-control contrast in outcomes from a simple randomized experiment. We discuss each in more detail in Section IV, where we also discuss potential solutions to each.

i. Spillover Effects and the Stable Unit Treatment Value Assumption

The above brief overview of the econometrics of experiments glosses over an important assumption, known as the “stable unit treatment value assumption,” or SUTVA (Angrist, Imbens, and Rubin 1996; Athey and Imbens, this volume). Intuitively, this assumption states that the outcome of individual $i$ is unaffected by the treatment status of each of the other study participants. Without this assumption, each individual has not two but $2^N$ potential outcomes, making analysis intractable. For many program evaluations, SUTVA is innocuous. But in other cases it can be quite restrictive. For example, the provision of job search assistance to some individuals may create “congestion” in the labor market, reducing the job-finding rates of others participating in that market. This is a violation of SUTVA, and will lead a simple randomized trial to overstate the total effect of job search assistance. Another potential violation of SUTVA occurs if members of the treatment group interact with each other or with the control group in a way that dilutes the treatment difference between them – for example, if the treatment involves information provision but treated individuals pass that information on to the controls.
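A stylized calculation shows how congestion can drive a wedge between the experimental contrast and the total effect. The allocation rule below, in which a fixed number of vacancies is shared in proportion to search intensity, is an extreme assumption chosen only for illustration, as are all parameter values and function names.

```python
def job_finding_rates(n_treated, n_control, vacancies, boost):
    """Job-finding rates when a fixed stock of vacancies is shared in proportion
    to search intensity (1.0 for controls, `boost` for treated job seekers).

    This "rat race" allocation rule is an illustrative assumption.
    """
    total_intensity = n_treated * boost + n_control * 1.0
    rate_treated = vacancies * boost / total_intensity
    rate_control = vacancies * 1.0 / total_intensity
    return rate_treated, rate_control


def spillover_summary(n_treated=500, n_control=500, vacancies=300, boost=1.5):
    """Compare the experimental contrast with effects relative to no program."""
    rate_t, rate_c = job_finding_rates(n_treated, n_control, vacancies, boost)
    # Counterfactual with no program at all: everyone searches at intensity 1.0.
    rate_none = vacancies / (n_treated + n_control)
    return {
        "experimental_contrast": rate_t - rate_c,
        "effect_on_treated_vs_no_program": rate_t - rate_none,
        "effect_on_controls_vs_no_program": rate_c - rate_none,
        "change_in_total_jobs": n_treated * rate_t + n_control * rate_c - vacancies,
    }


if __name__ == "__main__":
    for name, value in spillover_summary().items():
        print(f"{name}: {value:.3f}")
```

In this example the treatment-control contrast in job-finding rates is 12 percentage points, yet total employment is unchanged: half of the contrast reflects jobs displaced from the control group. Less extreme allocation rules would give smaller but qualitatively similar overstatement.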

ii. Endogenously observed outcomes

In many labor market experiments, some outcomes of interest are observed only for a subset of individuals. For example, weekly hours of work (labor supply), hourly wages, job characteristics, career advancement, and retention on the job are observed only for those who are able to find jobs, not for those who are unemployed. Even ideal experiments with perfect compliance may not identify the causal effects of interest on these outcomes.
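The problem can be seen in a simulated experiment with perfect compliance. In the sketch below, which rests entirely on assumed functional forms and parameter values, the program has no effect on anyone's wage and only makes it easier for lower-skill individuals to find work. Comparing wages among the employed nonetheless yields a negative treatment-control contrast, because the two employed groups differ in composition.

```python
import random
import statistics


def wage_contrast_among_employed(n, seed=0):
    """Treatment-control difference in mean wages among the employed.

    By construction (an assumption of this sketch) the program has zero effect
    on anyone's wage; it only lowers the skill threshold for finding a job.
    """
    rng = random.Random(seed)
    observed = {True: [], False: []}
    for _ in range(n):
        treated = rng.random() < 0.5          # randomized assignment
        skill = rng.gauss(0.0, 1.0)           # unobserved by the analyst
        wage = 15.0 + 3.0 * skill             # same wage under either assignment
        threshold = -0.5 if treated else 0.5  # program raises employment only
        if skill > threshold:                 # wage observed only if employed
            observed[treated].append(wage)
    return statistics.fmean(observed[True]) - statistics.fmean(observed[False])


if __name__ == "__main__":
    print(f"wage contrast among employed: {wage_contrast_among_employed(100_000):.2f}")
```

The contrast is negative even though the true wage effect is zero for every individual, so randomization of assignment alone does not protect comparisons that condition on employment.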

iii. Site and Group Effects

Another large class of limitations in experiments has to do with generalizing beyond the experimental sample. Extrapolations to other programs, other samples, or other treatment regimes can be hazardous. We will discuss in this paper three broad classes of external validity issues.

One class has to do with variations in the treatment on offer across program locations. In many programs, the treatment is not homogeneous across locations; in other cases, the treatment may be homogeneous but outcome distributions vary. In either case, one might be interested in identifying how treatment effects vary across locations.

The second class derives from observed differences between the population of interest and that included in the experimental sample – one might want to understand a program’s effect on a population that differs in observable ways from that represented in the experimental sample, or on a subpopulation other than the experimental compliers.

iv. Treatment Effect Heterogeneity and External Validity

The third class of external validity issues arises from unobserved differences in individual treatment effects – when the effect of the treatment varies across individuals in ways that are not captured by observed participant characteristics, and when the parameters of interest extend beyond the average treatment effect in the population from which the experimental sample is drawn. This can occur when, for example, the experimental complier share is not expected to match the take-up rate when the program is offered more generally, or when one expects to offer the program to a population that may differ in its treatment effect distribution from the experimental population. While conceptually similar to differences along observed characteristics, the econometrics behind addressing unobserved differences in treatment effects is sufficiently complex and self-contained that we discuss it separately.

v. Hidden Treatments

Interpreting estimated program effects and extrapolating to other settings can be complex even in the case of uniform treatments and uniform populations. For example, if non-compliers have access to alternatives to the program under study (e.g., to courses offered by alternative job training providers), this will lead to variation in treatment effects even without treatment effect heterogeneity or non-compliance in treatment assignment in the standard sense. The alternative treatments are often “hidden,” as administrative data on the program under study will not reveal whether participants have received alternatives elsewhere. In this case, the experimental impact identifies the treatment’s effect relative to a poorly specified alternative that may not differ dramatically, and may be a poor guide to the program’s value relative to no treatment. In multi-site studies, differential take-up of such hidden treatments by the control group may create the appearance of treatment effect heterogeneity across sites and hinder extrapolation to other settings.

vi. Mechanisms and Multiple Treatments

In many instances, we are interested in understanding the mechanism generating a particular treatment effect. In some cases, the effects of separate mechanisms are of inherent interest. In complex experiments with multiple treatments, it is important to understand which treatments were particularly effective, and why. For example, many job training programs include job search assistance, and vice versa. In other cases, understanding the mechanisms is crucial in extrapolating from the particular experimental setting to other situations. For example, in the Canadian Self-Sufficiency Program (SSP) workers have to first establish eligibility to then participate in a wage subsidy program, creating endogenous selection that makes it difficult to interpret how the subsidy program affects labor supply (Card and Hyslop 2005). Without additional information or additional structure, multiple mechanisms are not separately identified, leading to potentially serious limitations in understanding of the program and in external validity.

d. Quasi-experimental and Structural Research Designs

It is not always possible to use a true randomized experiment to evaluate a program or mechanism of interest, due to operational, financial, or ethical constraints. Quasi-experimental studies rely on aspects of the program or policy variation as a source of plausibly as-good-as-random variation in treatment assignment – examples include regression discontinuity designs, regression kink designs, and difference-in-differences (see Angrist and Krueger 1999). These can be useful alternatives when true experiments are infeasible or simply not available. When the quasi-experimental variation is as good as randomly assigned, the various quasi-experimental designs can recover treatment effects just as can experiments.

But even if the assumptions governing assignment are correct, quasi-experimental designs generally solve only the assignment problem, and do not necessarily address the additional issues discussed above. The same is true for selection-on-observables estimators (e.g., matching estimators): The “unconfoundedness” assumption eliminates the selection problem, if it holds, but does nothing to address other design issues.

In contrast, structural approaches that explicitly specify all aspects of the choice problem and resulting outcomes can in principle resolve both assignment and other design issues simultaneously. However, this approach hinges on the model being correctly specified, and hence may come at a substantial cost to internal validity.

III. A more thorough overview of labor market social experiments

It is no accident that we discuss design issues of RCTs in the context of social experiments in the labor market, since many of the major design issues discussed in Section II arise in the evaluation of important labor market programs. In this section we review some of the main characteristics of existing social experiments in labor economics in light of these design issues. We distinguish three broad substantive topics that have been studied extensively via social experiments: Labor supply, particularly of low-income families, welfare recipients, and unemployment insurance recipients; job training and skill development; and job search. In this Section, we discuss each in turn. For a more detailed discussion of the experiments we mention here, we refer the reader to our summary tables, and to the excellent overviews provided elsewhere.¹⁰

a. Labor Supply Experiments

One can broadly categorize social experiments providing incentives to increase labor supply into three groups, following their program structure, target group, and time period: The Income Maintenance Experiments in the late 1960s and early 1970s; welfare reform experiments in the late 1980s through the mid-1990s; and reemployment subsidy experiments, which span a longer time period.

The Income Maintenance Experiments

The first wave of experiments consisted of the Income Maintenance Experiments (IME) already discussed in Section II, which treated low-income households with various combinations of lump-sum transfers and taxes on earnings. By randomly assigning participants to a control group or to one of multiple treatment arms with varying combinations of tax rates and subsidies, and by separately targeting groups of different income levels, the experiments allowed tracing out labor supply responses in different parts of the budget constraint and under varying financial conditions. There were four such experiments, initiated between 1968 and 1971, in New Jersey, Seattle-Denver, Gary (IN), and in rural areas. Table 1 provides detailed information about these experiments. While the sample sizes were moderate by later standards, the total cost was substantial compared to most randomized evaluations of labor supply incentives that would follow. This is in large part because the program – the payments themselves – was expensive on a per-participant basis. Complex, stratified experimental designs were used in efforts to minimize these costs, but even with these the studies were major investments.

¹⁰ See among others Greenberg and Shroder (2004), Heckman, Lalonde, and Smith (1999), Meyer (1995). Our overview focuses almost exclusively on U.S. experiments. For an overview of active labor market policy evaluations, drawing largely on European evidence, see Card, Kluve, and Weber (2010).

Across each of the income maintenance studies and various comparison groups (e.g., husbands, wives, and single female household heads), labor supply results were fairly consistent: The combination of a lump-sum transfer and a positive tax rate reduced participants’ earnings (i.e., labor supply), more so when the transfer and tax rate were larger. This reflects a combination of income and substitution effects; Robins (1985) combines the various studies and uses contrasts among the different treatment arms to separately identify the income and substitution elasticities of labor supply. He concludes that these elasticities were fairly stable across studies, but fairly small: The substitution elasticity was under 0.1 for husbands, just above 0.1 for single female heads, and more variable but averaging 0.17 for wives. Income elasticities were less consistent, but centered around -0.1.

In retrospect, these experiments encountered a number of the design issues that we identified in Section II and discuss at greater length below. For example, because of the high attrition rates, which as Ashenfelter and Plant (1990) note were differential across treatment groups, they also can be seen as an example of the endogenously observed outcomes problem. Similarly, without additional assumptions it is impossible to estimate the effect of these programs on hours worked or wages. Interestingly, in contrast to most randomized evaluations that followed, they were primarily focused on identifying the mechanisms – income vs. substitution effects – behind any labor supply responses, rather than the simple treatment effect of an existing program. This motivated the use of a large number of treatment arms, an option we discuss below as one way of addressing questions about mechanisms.

Welfare Reform Experiments

A second wave of social experiments related to labor supply was initiated between the late 1980s and the mid-1990s, and evaluated the effect of employment incentives for welfare recipients. While the IME experiments were funded almost exclusively by the federal government, these later evaluations concerned state-level programs and were funded mostly at the state level.¹¹ In contrast to the relatively straightforward structure of the negative income tax treatments, these were usually randomized evaluations of entire, complex programs, often designed as replacements for traditional AFDC, that included components designed to strengthen work incentives along with others (e.g., child care or job search assistance) designed to reduce barriers to work.

¹¹ For a detailed historical account, see the chapter by Judith Gueron in this volume.

We have identified welfare RCTs in at least 13 states. Table 1 includes a selection of four social experiments on this topic, implemented in California, Connecticut, Florida, and Minnesota, though there were many more not listed here. A common component of most new programs (experimental treatments) was the introduction of lifetime time limits on welfare receipt and increases in earnings disregards, both eventual components of the 1996 federal welfare reform. Prior to this reform, implementation of such changes required a waiver from the U.S. Department of Health and Human Services, and this was often conditioned on an experimental evaluation. The exact nature of both the new programs and the traditional welfare benefit varied by state. Other program features varied widely as well, including job search assistance, access to child care, changes in case management, and provision of job training.

Two examples to which we will refer later are Connecticut’s Jobs First and Florida’s Family Transition Program. In both cases, control group members faced a welfare benefit schedule that had no time limits and high implicit taxes on working.¹² Jobs First and the Family Transition Program each introduced time limits for welfare receipt and benefit schedules with lower implicit tax rates. Under Jobs First, eligible welfare recipients saw no reduction in their benefits while working until earnings hit the federal poverty line. Under the Family Transition Program, a working welfare recipient could keep $200 a month, plus 50% of all earnings above $200. Both programs also modified other welfare program features, including enhanced enforcement of work requirements, changing the duration of access to Medicaid benefits, setting asset limits for welfare receipt, and providing child care assistance, among others.

¹² In Connecticut, welfare recipients were eligible for a fixed earnings disregard of $120 for the twelve months following the first month of employment while on assistance and $90 afterwards. Recipients were also eligible for a proportional disregard of earnings above $120 ($90): 51% for the four months following the first month of employment and 27% afterwards. In Florida, after the first four months of work, the marginal tax rate on earnings for AFDC recipients was 100% if they earned over $90 per month.

The randomized evaluation of the two programs captured the combined effects of all of these changes on employment and earnings. Each program led to higher earnings and higher total incomes, inclusive of welfare payments, in the treatment group, though in each case this effect diminished over time. Total governmental costs were higher for the Connecticut treatment group than for controls, but the reverse was true in Florida. An important caveat is that these results largely reflect the period before time limits became binding.

In many of the welfare-to-work experiments, key outcomes of interest included hours of work among those who are employed and wages or earnings. Neither of these is observed for those who are not employed. Thus, although many studies report experimental effects on endogenously observed outcomes, these are understood to suffer from serious selection problems. Another issue to take into account in interpreting these experiments is the possibility of spillover effects. These were typically not small pilot studies but involved broad changes to welfare rules, sometimes applied to all program participants except for a hold-out control group.

Another major question regarding welfare-to-work programs concerns heterogeneity in treatment effects. One might imagine that there is a subpopulation of recipients who are responsive to work incentives and another group of hard cases who are much less responsive. The average treatment effects that can be estimated from these experiments might substantially overstate the employability of the latter participants.

Reemployment Subsidy Experiments

A third broad group of labor supply-related experiments evaluated direct reemployment subsidies. One set of such programs had incentives structured like a negative income tax and was targeted to welfare recipients or low-income individuals, sometimes as part of the same AFDC reforms discussed above. These took place mostly in the mid- to late 1990s, and included the Canadian Self-Sufficiency Program (SSP), Minnesota’s Family Investment Program (FIP), and Wisconsin’s New Hope Project. These RCTs can be seen as evaluations of welfare-like programs, but included subsidies that were conditional on sustaining a certain amount of employment. Not surprisingly, these programs generally led to increased earnings among treatment group participants (though FIP was an exception); different studies varied in whether the additional income of participants was larger or smaller than the extra welfare costs borne by the government.

Another set of such programs consisted of schemes that paid lump-sum subsidies conditional on employment – effectively, bonuses for finding work. These include the well-known reemployment bonus experiments targeted at unemployed workers receiving unemployment insurance in Illinois, Pennsylvania, and Washington State in the mid-1980s. These studies found that eligibility for a relatively large reemployment bonus led to shorter unemployment insurance spells, with no detectable impact on the quality of the job obtained, but that the effects were relatively small and thus the programs were not cost effective.

More recently, a bonus for welfare recipients who found a job and who remained employed for a certain time was evaluated in the context of Texas’ Employment Retention and Advancement (ERA) project in the early 2000s (Dorsett et al. 2013). The Texas evaluation was part of a large-scale randomized evaluation of 12 different service combinations in different U.S. cities from 2000 to 2004 under the ERA project umbrella (Hamilton and Scrivener 2012). The main focus of ERA was to expand workforce services to recently reemployed welfare recipients or low-wage workers to maintain successful labor force attachment (though three sites, including Texas, combined pre- and post-employment assistance). The evaluation tested a broad range of services, with at best mixed results regarding the effects of the post-employment services tested.

An important feature of several of these employment subsidy programs was that potential recipients had to become eligible for the subsidy, usually by working a minimum number of hours. Hence, while the main goal of the programs was to help workers build attachment to the labor force, effects of the subsidy (as distinct from the subsidy offer) on the duration of employment could be estimated only for those who found jobs in the first place, a subsample that was differentially selected in the treatment and control groups. Card and Hyslop (2005) refer to this as an ‘eligibility effect’; in our earlier taxonomy of design challenges, this can be seen as a case where the mechanisms underlying the treatment effect are of primary interest. Under any name, it complicates the interpretation of the outcomes of a simple RCT.

Overall, randomized studies of a range of labor supply incentive programs have found labor supply responses to changes in implicit or explicit financial incentives as predicted by theory. However, a broad theme emerges that employment effects have mostly been short-lived, and effects on total participant income have been inconsistent. A challenge in interpreting these studies has been that typically a number of treatments were varied simultaneously, including implicit tax rates and lump-sum transfers, training programs, job search assistance, enforcement, and/or time limits. Hence, extrapolating from these findings to new programs providing different combinations of treatments is difficult without understanding the underlying behavioral responses, which typically requires additional assumptions.

b. Training experiments

From 1964 to today, we count over 50 RCTs that evaluate job training programs of various forms. These include large-scale evaluations conducted at the national level, state-level evaluations, and evaluations of programs at the local level. The programs evaluated varied substantially in the type of training, which ranged from vocational and general classroom-based training of different durations to on-the-job training by actual employers. Most training programs were complemented by some kind of job search assistance, but in the studies we review here this was not the emphasis. Table 2 provides an overview of a selected group of these RCTs.

Training programs are less easily classified than labor supply programs. While the first job training social experiment of which we are aware focused on laid-off workers (the General Education in Manpower Training experiment, begun in 1964), the vast majority of training programs are targeted to welfare recipients, to low-income individuals generally, or to low-income youth. Moreover, while one can broadly distinguish phases of experimental evaluation parallel to the patterns in the evaluation of welfare programs outlined above, randomized evaluations of training programs occurred more evenly from the 1980s to today. It is also harder to discern common patterns in the types of training provided or programs evaluated.

The first large-scale evaluation of a mix of on-the-job experience and supervision for hard-to-employ individuals was the National Supported Work Demonstration (NSWD), which ran from 1975 to 1980. The NSWD was a large and expensive social experiment implemented by the U.S. at the national level, but did not evaluate an established training program. Rather, the NSWD relied on local non-profits to organize a program in which treatment participants were placed in teams of up to 10 participants working under a foreman, who also served as a counselor and later provided job search assistance, on small-scale projects, typically in construction, light manufacturing, or social service provision. Participants received as much as one year of work experience, under conditions of increasing demands, close supervision, and work in association with a crew of peers. The study targeted four groups of workers: women who had been on AFDC for at least 30 months; ex-addicts; ex-offenders; and young high-school dropouts. It took place at 10 sites, and at each site enrollees were selected randomly from a group of volunteers.¹³ Participation had large positive effects on AFDC recipients and smaller positive effects on ex-addicts, but benefits for other groups were smaller and generally statistically insignificant.

The data used to evaluate NSWD came from a series of follow-up surveys.¹⁴ Attrition was an issue here: After 27 months, only 72% (68%) of the treatment (control) groups of the NSWD completed interviews. As in the NIT studies, this can be seen as a variant of the endogenously observed outcomes problem.

The NSWD study was followed by a range of evaluations of state-level programs in the early to mid-1980s. These were targeted almost exclusively at welfare recipients, and largely financed by the federal government. These evaluations continued, with greater involvement of state governments, through the late 1980s and mid-1990s. While many of these RCTs were relatively small, some were substantial. Examples include the California GAIN and Ohio JOBS program evaluations, beginning in 1988 and 1989, respectively. Detailed characteristics of some of these evaluations are shown in Table 2. The California program, which was mandatory for welfare recipients, included job search assistance, basic education, and skills training. It had large positive effects on earnings and negative effects on welfare receipt, particularly for single parents. Effects were largest in Riverside County, where administrators emphasized job placement as the central goal. However, a reanalysis of the long-term effects of GAIN by Hotz et al. (2006) found that the effects in Riverside County were short-lived relative to those in Los Angeles County, which focused more on human capital development and where effects were initially smaller but rose over time.¹⁵ The Ohio program was similar in design but encountered more problems in implementation, and yielded smaller effects.

¹³ The Manpower Demonstration Research Corporation (MDRC) was founded in 1974 to manage the NSWD study. For a detailed summary of the program and findings, see Manpower Demonstration Research Corporation Board of Directors (1980).

¹⁴ The NSWD has been examined by an extensive literature, including Lalonde (1986), Dehejia and Wahba (2002), and Smith and Todd (2005).

An exception to the trend towards evaluation of state-level or local training programs was the large-scale, national evaluation of the main federal training program aimed at low-income adults and disadvantaged youth – the National Job Training Partnership Act (JTPA) Study. The JTPA was a federal program enacted in 1982, and was administered at the state and local level. JTPA training programs provided employment training for specific occupations and services, such as job search assistance and remedial education, to roughly one million economically disadvantaged individuals per year. While the program and some services were administered directly by JTPA staff, training was provided through local service providers, such as vocational-technical high schools, community colleges, proprietary schools, and community-based organizations. Training lasted three to four months, on average, but duration varied widely across individuals and program sites.

Congress, in part responding to limitations of non-experimental evaluations of the predecessor program to JTPA, the Comprehensive Employment and Training Act, mandated a randomized evaluation of JTPA in 1986. Control subjects were excluded from obtaining JTPA services for 18 months. To assess short- and medium-term program impacts on employment and earnings, the evaluation both collected survey data and drew from administrative state-level records.¹⁶ The evaluation took place at 16 JTPA program sites (so-called Service Delivery Areas, SDAs). Participation by SDAs in the evaluation was voluntary, and some SDAs objected to randomly excluding eligible applicants. The participating SDAs did not differ from others in observable characteristics (e.g., Bloom et al. 1997), but may have differed in unobserved ways that would be relevant to an extrapolation to the overall effect of the national program.

¹⁵ Hotz et al. (2006) also point out that the treatment group was selected differently across the four GAIN sites, possibly contributing to the estimated ‘site’ effects. For example, the Riverside County RCT sample included a smaller fraction of the more disadvantaged welfare recipients.

An explicit goal of the JTPA evaluation was to obtain differential impacts for a wide range of target groups, including adult women, adult men, female youths, and male youths with and without an arrest record. Adult women saw the largest earnings gains, followed by adult men; effects on youth were smaller and generally not significant (though there were significant effects on attainment of high school diplomas for both adult women and female youth). In addition to demographic subgroup analyses, heterogeneity in program impacts was estimated along several other dimensions, including JTPA services recommended by program intake staff, ethnicity, and prior labor market experience. While the subgroup effects of interest were largely pre-specified, this does not fully eliminate multiple-comparisons problems, particularly when the number of pre-specified comparisons is so large, and thus there is an enhanced risk of a false positive.
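The arithmetic behind this concern is simple. The sketch below assumes independent tests of true null hypotheses, which is a simplification, and the subgroup counts it loops over are hypothetical rather than taken from the JTPA study.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive among n_tests independent
    tests of true null hypotheses, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_level(n_tests, alpha=0.05):
    """Per-test level that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    for k in (1, 5, 20):  # hypothetical numbers of subgroup comparisons
        print(k, round(familywise_error_rate(k), 3), bonferroni_level(k))
```

With 20 independent comparisons at the 5% level, the chance of at least one spurious significant result is roughly 64%, whether or not the comparisons were specified in advance. A Bonferroni adjustment is one conservative remedy; it controls the false positive rate at the cost of statistical power.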

¹⁶ See Bell et al. (1994) and Bloom et al. (1997) for descriptions of the JTPA evaluation. There is a substantial literature on the evaluation of the JTPA program. See Heckman, Lalonde, and Smith (1999) for a summary.

Job training evaluations slowed after welfare reform in the mid-1990s, then began to pick up again in the early 2000s. Some evaluations in this period focused on sector-specific employment, such as the Sectoral Employment Impact Study (e.g., Maguire et al. 2010) and evaluations of similar smaller, local programs.¹⁷ There was also a randomized evaluation of combined training and job placement services under the Workforce Investment Act (WIA) from 2005 to 2015 (the Work Advancement and Support Center Demonstration), and more recently a study of the return from community college attendance under the Trade Adjustment Assistance Community College and Career Training (TAACCCT) Grants Program.

A distinct broad strand of randomized evaluations of training programs focuses on low-income youths. Again, these programs offer a broad range of different types of training augmented by varying combinations of support services. Social experiments in this area have included a range of federally and nationally funded evaluations spanning the early 1980s to the mid-1990s that culminated in the National Job Corps Study, described below. As in other job training studies, the pace of experimentation slowed in the mid-1990s, but several new studies were undertaken in the mid-2000s. Some randomized evaluations, such as that of New York City’s Summer Youth Employment Program (strictly, a natural experiment, as randomization is part of the rationing process and not a decision made in order to facilitate an evaluation), are ongoing. Again, the broad trend was from a federal monopoly on funding towards a greater involvement of local and private funding sources.

¹⁷ These include, among others, the Georgia Works programs, Project Quest in San Antonio, the Wisconsin Regional Training Partnership in Milwaukee, Per Scholas in New York City, and the Jewish Vocational Service in Boston.

The largest and perhaps best known study of a training program for disadvantaged youths is the National Job Corps Study. The Job Corps was created in 1964 as part of the War on Poverty, and currently operates under the provisions of the Workforce Innovation and Opportunity Act of 2014, which consolidated programs authorized under the Workforce Investment Act of 1998. Job Corps services are geared towards economically disadvantaged youths aged 16 to 24. Core services are delivered by a Job Corps center, usually residential, and include vocational training, academic education, residential living, health care, and a wide range of other services, including counseling, social skills training, health education, and recreation.¹⁸ About a quarter of the over 100 centers are operated directly by the U.S. government, with the remainder operated by private contractors. The average duration of the program is eight months, though by its philosophy the duration responds to the participant’s needs and actual duration varies widely. For six months after the youths leave the program, placement agencies help participants find jobs or pursue additional training.

The Job Corps evaluation was based on an experimental design in which, with a few exceptions, all youths nationwide who applied to Job Corps in the 48 contiguous states between November 1994 and December 1996 and were found to be eligible were randomly assigned to either a program group or a control group. Program group members were allowed to enroll in Job Corps; control group members were excluded for three years after random assignment. The comparisons of program and control group outcomes represent the effects of Job Corps relative to other available programs that the study population would enroll in if Job Corps were not an option.¹⁹ The control and treatment groups were tracked with a series of interviews immediately after randomization and continuing 12, 30, and 48 months after randomization.

¹⁸ The majority of training is vocational, and curricula were developed with input from business and labor organizations and emphasize the achievement of specific competencies necessary to work in a trade. Academic education aims to alleviate deficits in reading, math, and writing skills and to provide a GED certificate. Although most Job Corps services are residential, there have been nonresidential participants (mostly women with children). There have been efforts to evaluate non-residential Job Corps services (e.g., Greenberg and Shroder 2004, Schochet et al. 2008).

The evaluation of Job Corps followed the outcomes of over 15,000 experimental subjects for up to eight years using survey and administrative data. The effect of training on earnings became gradually positive as individuals graduated from the program, and the difference from the control group then remained statistically significant for up to four years afterwards. At the same time, government transfers and crime rates fell (e.g., Schochet et al. 2008). There was substantial heterogeneity in outcomes – the effects were strongest for those 20-24 years old at the time of training, and weakest for Hispanics.

A concern with these findings was that the overall level of earnings and the size of the treatment effects were quite different in the administrative data than in the survey data. While survey data are more likely to be affected by endogenous attrition, administrative data are not a panacea: They exclude under-the-table employment, which may be common in the Job Corps population.20 They also cannot address the problem that wages are observed only for those who are employed, itself an intermediate outcome of the program (e.g., Lee 2009).

19 Of course, if Job Corps did not exist, the ecosystem of other available programs would presumably change. This is formally a SUTVA violation, and implies that control group mean outcomes may not equal what would be seen in the absence of the program.

An important question regarding Job Corps is the relative performance of the different Job Corps centers, which operate in different labor markets and are (sometimes) run by contractors rather than directly by the government. Schochet and Burghardt (2008) use the Job Corps evaluation data to estimate separate treatment effects by site, finding that these are not strongly correlated with the non-experimental measures that have been used to assess site performance.

A final issue in the Job Corps evaluation, not to our knowledge addressed in the literature, is that the program may be large relative to the relevant labor markets, creating the possibility of important spillovers from treated to control study participants.

A final, smaller category of large-scale social experiments of training programs focused specifically on unemployed (displaced) workers. As we will discuss below, some of these RCTs evaluated programs providing a broad array of reemployment services that also included some degree of training. This raises a similar issue to what we highlighted above with welfare experiments – experimental evaluations generally identify the “black box” effect of the overall programs, but not the components or mechanisms responsible for those effects.

20 Kornfeld and Bloom (1999) show that this is the case for participants in the Job Training Partnership Act (JTPA) evaluation.

The Individual Training Account (ITA) Experiment, running from 2001 to 2005, directly evaluated different modes of training provision prescribed by the 1998 Workforce Investment Act. WIA allowed local agencies to impose different degrees of counseling and supervision of workers’ training choices, and the ITA experiment evaluated the effect of these choices on actual training received and labor market outcomes. Effectively, the ITA experiment compared three service models. Guided Choice and Maximum Choice had standardized subsidies for training, but the former required counseling by a caseworker while the latter had no counseling requirement. A third model, Structured Choice, was effectively like Guided Choice but offered individualized, and typically more generous, training awards.21

The findings indicated that either more generous awards (Structured Choice) or less counseling (Maximum Choice) led to a higher incidence of training (Perez-Johnson et al. 2011). Earnings increased for workers in Structured Choice relative to Guided Choice five years after the treatment. (Earnings effects were higher but not statistically different for Maximum Choice relative to Guided Choice or to a control group.) While Structured Choice was estimated to be cost efficient to society, it was more expensive for the workforce system, and most agencies adopted Guided Choice as the leading model. More recently, an ongoing experiment (the WIA Adult and Dislocated Worker Programs Gold Standard Evaluation, discussed below) evaluates directly the intensive and training services provided under WIA.

21 Originally, under Structured Choice caseworkers were supposed to play a more active role in training choice. However, most caseworkers did not feel they had enough knowledge of local labor markets or the worker’s skills to take on such an active role.

An issue that is common to all of the job training experiments is the possibility that individuals assigned to the control group may have received training through other channels that would not necessarily have been tracked in the experimental data. These hidden treatments are likely to attenuate the estimated training effects – insofar as control participants are receiving substitute treatments, the evaluations identify only the differential effect of the public training program, rather than the overall effect of training relative to none. While this could partly explain low estimated treatment effects, this has not been examined carefully in the literature (though, as we discuss below, it has received substantial attention in some other domains, most notably the evaluation of early childhood education).
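
A stylized calculation may help fix ideas about this attenuation. It is an illustration under strong simplifying assumptions, not a result from any of the evaluations discussed here. Suppose every program group member receives training with a constant earnings effect $\beta$, while a hypothetical share $c$ of the control group obtains substitute training with constant effect $\beta_s$, and let $\mu_0$ denote mean earnings without any training. The experimental contrast is then:

```latex
\begin{align*}
E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]
  &= (\mu_0 + \beta) - (\mu_0 + c\,\beta_s) \\
  &= \beta - c\,\beta_s .
\end{align*}
```

If substitute training is exactly as effective as the program ($\beta_s = \beta$), the contrast shrinks to $(1 - c)\beta$. With an assumed $c = 0.3$, for instance, the experiment would recover only 70 percent of the effect of training relative to none.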

Although a broad range of findings from different treatments makes it hard to generalize, two themes have emerged from training program social experiments. First, while training for less advantaged adults and the unemployed can have beneficial effects, most training programs for disadvantaged youths fail to achieve strong results. An important exception is Job Corps, which has shown short- and medium-term positive effects for at least some of its participants. Second, the effects of training tend to accrue gradually over time, making them hard to detect in research designs that combine multiple treatments or that do not have sufficient data or samples to precisely estimate medium- to long-term effects.

c. Job Search Assistance

From the inception of welfare programs in the U.S. it was suspected that neither better work incentives nor better human capital would be sufficient to place hard-to-employ welfare recipients or disadvantaged youth into lasting employment, and that part of the challenge derived from disconnection from the world of work. At the same time, it was not clear which of a range of support services aiding job placement would be effective. Hence, a large number of RCTs have evaluated a range of job search assistance (JSA) programs for low-income workers and youth. Other studies have focused on unemployment insurance recipients and other unemployed workers, who have traditionally been eligible for search assistance from the U.S. government. Hence, while training evaluations have mostly concerned programs aimed at low-income workers, job search assistance experiments have evaluated programs geared towards a wider range of unemployed workers from the mid-1970s to today. As in training evaluations, however, an important challenge in studies of job search assistance is measuring the counterfactual: What sort of assistance, if any, was received by those excluded from the program under study?

An early wave of JSA program experiments geared towards welfare recipients occurred from the early 1970s to the mid-1980s, alongside similar studies of labor supply and training programs aimed at the same population. These were mostly evaluations of local programs funded by the federal government. There is a long history of programs providing placement and training services for welfare recipients in the United States, going back at least to the Work Incentive Program (WIN) initiated in 1967. WIN was criticized on a range of fronts (e.g., Gold 1971). The first wave of federally-funded evaluations tested services provided by the WIN program and alternative programs for WIN-eligible welfare recipients (e.g., Grossman and Roberts 1989). These culminated in the National Evaluation of Welfare-to-Work Strategies (NEWWS) in 1990, which was a large-scale evaluation of 11 programs combining JSA, training, and enforcement of job search requirements in 7 different sites in the U.S.

The results from randomized evaluation of different WIN services were mixed (e.g., Greenberg and Shroder 2004). The evaluation of so-called “job clubs” in 1976-1979 showed substantial increases in employment and reductions in welfare receipt. As a result, job clubs became an integral part of services received by welfare recipients. However, the evaluation was based on a relatively small sample, follow-up was limited to one year, and the results indicated substantial, hard-to-explain heterogeneity in the findings across subgroups and treatment sites. In contrast, the evaluations discussed in Grossman and Roberts (1989) show less consistent effects of JSA under the WIN program.

The much larger evaluation of NEWWS found short-term increases in employment and reductions in welfare receipt. These effects dissipated during the five-year follow-up period. As in other evaluations occurring in the early to mid-1990s, such as GAIN discussed above, this may be due in part to the high-pressure labor market of the 1990s. The presence of such cyclical effects is a potentially important confounder limiting the interpretation of the effects of labor market program studies.

A second wave of experiments occurred in the run-up to welfare reform in the mid-1990s, and again saw substantial state-level involvement. As with labor supply and training studies in this period, these experiments tended to examine contemplated changes to existing programs and to involve large samples. These included Project Independence in Florida in 1990 (over 13,000 treatment and 4,000 control subjects), the Indiana Welfare Reform Evaluation in 1995 (over 67,000 treatment and 4,000 control subjects), and the LA Jobs First GAIN evaluation in 1995 (over 15,000 treatment and 5,000 control subjects).22 Among these, only the GAIN evaluation discussed above allows inference about the role of JSA alone. The findings confirm that JSA can yield substantial gains in employment, at least in the short term.

In parallel, another group of experiments evaluated JSA services provided to recipients of unemployment insurance. Most of these included a combination of direct job search assistance, instructions on how to search for a job, and verification of job search. These experiments, to a large extent discussed in Meyer (1995), included Nevada (1977, 1988), Charleston (1983), Texas (1984), New Jersey (1986), and Washington State (1986). Another set of experiments during the same period assessed only the effect of verification of job search requirements. Ashenfelter, Ashmore, and Deschenes (2005) discuss experiments in Connecticut, Massachusetts, Tennessee, and Virginia.23

As summarized by Meyer (1995), a core finding of these studies is that JSA reduces unemployment insurance (UI) receipt, at least in the short run. The effects are small, but cost effective from the point of view of the UI agency. The effects on earnings tend to be imprecise, consistent with the possibility that the program impacts derive from workers who leave the UI system without finding jobs. Little is known about which components of JSA matter. Experiments in Nevada and Minnesota suggest that intensive JSA has much stronger effects than do more limited treatments. There is mixed evidence as to whether the verification requirement alone matters: The experiments discussed in Ashenfelter et al. (2005) indicate no effects, while a Maryland study summarized in Klepinger, Johnson, and Joesch (2002) did find effects. This question is a key aspect of ongoing evaluations of the Reemployment and Eligibility Assessment system, discussed below.

22 There also have been evaluations of JSA services explicitly directed at low-income youth, but most such RCTs that we found were relatively small. The evidence on this subject quoted most frequently is related to the job search component provided in the JTPA and Job Corps programs.

23 Other such experiments include Minnesota (1988), Maryland (1994), and Washington D.C./Florida (1995-1996); see Greenberg and Shroder (2004).

Since this early wave of UI experiments, the component of the UI system offering job search assistance and training has been repeatedly reformed, with several evaluations along the way. The Worker Profiling and Reemployment Services (WPRS) program was instituted in 1993. Under the WPRS, states are required to profile their UI claimants in order to identify those most likely to exhaust UI benefits and refer them to employment-related services.24 This program was evaluated via a natural experiment in Kentucky beginning in 1994 (Black, Smith, Berger, and Noel 2003, Black, Galdo, and Smith 2007). The findings from the WPRS study suggest that simply receiving a letter asking individuals to come into the office for JSA services reduces UI receipt and raises earnings. An important open question is whether this influential finding is replicated in a true RCT and in less favorable labor market conditions.

24 The services include (1) an orientation session to explain what reemployment services are available; (2) an assessment of the claimant’s specific needs; and (3) development of an individual plan for services based on the assessment. Claimants referred to reemployment services must participate in them as a condition of continuing eligibility. Allowable services include job search assistance and job placement services, such as counseling, testing, and providing occupational and labor market information; job search workshops; job clubs and referrals to employers; and other similar services.

The Workforce Investment Act (WIA) of 1998 combined most job placement services and training services provided under the auspices of the federal government under one roof, the so-called one-stop centers (e.g., Jacobson 2009). These centers, renamed American Job Centers in 2012, provide both “core” employment services (e.g., job search assistance) and “intensive” WIA services (e.g., career counseling and training) to the three core constituencies – unemployed workers, welfare recipients, and hard-to-employ young workers.

As the structure of service provision has evolved, additional RCTs have evaluated the system’s effectiveness at placing workers. For example, in 2005 the Department of Labor’s Employment and Training Administration launched a program called Reemployment and Eligibility Assessment (REA), consisting of mandatory in-person visits aimed at speeding the reconnection of UI claimants to the workforce.25 The REA meeting includes an eligibility review, provision of labor market information, development of a reemployment plan, and referral to more specific reemployment services. The first wave of randomized evaluation of the effectiveness of the REA counseling process took place in nine states beginning in 2005; a second wave of evaluations took place in four states in 2009. In both cases, the evaluations found that the REA requirement and services reduce UI benefit receipt (Benus et al. 2008, Poe-Yamagata et al. 2011). Earnings outcomes were studied in only one state (Florida), and were positive. An ongoing REA evaluation examines the difference in the effect of enforcing the interview requirement alone relative to the combined effect of the interview plus services (Klerman et al. 2013). A simultaneous evaluation begun in 2011, the WIA Adult and Dislocated Worker Programs Gold Standard Evaluation,26 complements the evaluations of REA, WPRS and earlier JSA programs by focusing on the effectiveness of WIA’s intensive and training services geared to unemployed adults not covered by the earlier evaluations.

25 The REA program was instituted to counteract the trend towards processing of UI claims by telephone and the internet. The concern was that the net effect of these changes was to reduce in-person contact and hence the opportunity to monitor job search activity and orient UI claimants to services available to speed their reemployment (e.g., O’Leary 2006).

A summary of the wide range of studies of JSA indicates important heterogeneity of effects by the population targeted. For welfare recipients, a difficulty in assessing the effect of JSA is that many experiments tested JSA in conjunction with other programs. Those studies that focus mainly on the effects of JSA, such as the randomized evaluations of WIN, NEWWS or GAIN, often find positive effects on employment and earnings and negative effects on welfare receipt (but mixed effects at best on total income). These effects tend to be short-lived, and less is known about the longer-term outcomes. There is also little known about the potentially important role played by context, such as local labor market conditions.

In studies of JSA for UI recipients, a common result is a precisely estimated but rather small effect – e.g., a reduction of about one week of UI benefits, with no corresponding positive effect on earnings – unless the services provided are very intensive. The frontier in this area is assessing to what extent these effects arise from the threat of enforcement of service requirements spelled out by law, basic JSA services themselves, or more intensive services.

26 See http://www.mathematica-mpr.com/our-publications-and-findings/projects/wia-gold-standard-evaluation.

d. Practical Aspects of Implementing Social Experiments

Clearly, the implementation of large-scale social experiments is complex and faces a range of practical hurdles that can affect the quality of the results. Sections II.c and IV of this paper focus on a number of design issues that can limit the ability of even an ideal experiment to provide answers to the questions of interest.

Beyond these conceptual design issues, there are some common challenges and practical considerations that have come up over and over in the conduct of social experiments in the labor market. These play important roles in influencing the topics and questions that are studied via social experiments and in informing the study designs.

One set of challenges derives from the fact that, as noted above, one of the defining characteristics of social experiments is that they intend to examine programs that are already in place or might be put in place in essentially the same form that was used in the experiment. For this purpose, the experimental samples and hence the sampling frame need to be representative of the population that the program serves. This is a challenge in the case of many labor market programs, in part because the sampling frame is often available only to program operators or the government, and may be difficult to access due to formal approval processes.

Once the sampling frame is obtained, it is necessary to randomly assign some members of the sample to the program of interest and others to a control condition, which might be exclusion from the program or an alternative program design. This, too, can be difficult when the program is already in place. For example, if the program in question exists within an ecosystem of other programs, services, and service providers, it may be hard to exclude participants from the program or, if this is done, to avoid also excluding them from other programs that are administratively integrated. For instance, excluding a participant from job search assistance offered under the Workforce Investment Act (WIA) might also in practice exclude him or her from job training and other programs, as the same offices that provide job search assistance also do screening and referrals for other services. While some of these problems might be reduced by studying programs not already in place, as in the case of the Negative Income Tax experiments or the National Supported Work Demonstration, this can be quite costly, as the sorts of programs typically studied involve substantial program costs – commonly in the thousands of dollars per participant.

A second group of challenges has to do with the difficulty of enforcing compliance with randomization after it is conducted. Again, the use of actual programs tested in real-world settings limits the options. A common challenge in early experiments was that service delivery was delegated to individual caseworkers or sites that were both widely dispersed and not closely involved with the experimental design. This raises the possibility that caseworkers may deviate from random assignment, for example ensuring that a potential participant viewed as especially needy is not assigned to the control group. For example, a key concern in the National Job Corps Study was to ensure that local program operators properly implemented the randomization. Modern practice centralizes the random assignment process, carefully tracking participants’ initial assignments to ensure that participants assigned to undesirable treatment conditions do not re-enter the randomization to obtain a better assignment.27
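
The logic of centralized assignment can be sketched in a few lines of Python. This is a hypothetical illustration, not the software used in any study cited here: the class name `AssignmentRegistry`, the applicant identifiers, and the 60 percent treatment share are all our assumptions. The key feature is that the first draw for each applicant is stored and returned on any later request, so re-applying cannot produce a new assignment.

```python
import random


class AssignmentRegistry:
    """Central record of random assignments (hypothetical illustration)."""

    def __init__(self, treatment_share=0.5, seed=0):
        self.treatment_share = treatment_share
        self._rng = random.Random(seed)
        self._assignments = {}  # applicant_id -> "treatment" or "control"

    def assign(self, applicant_id):
        # Draw only if this applicant has never been randomized; otherwise
        # return the stored arm so a second application changes nothing.
        if applicant_id not in self._assignments:
            draw = self._rng.random()
            arm = "treatment" if draw < self.treatment_share else "control"
            self._assignments[applicant_id] = arm
        return self._assignments[applicant_id]

    def share_treated(self):
        arms = list(self._assignments.values())
        return arms.count("treatment") / len(arms)


registry = AssignmentRegistry(treatment_share=0.6, seed=42)
first = registry.assign("applicant-001")
second = registry.assign("applicant-001")  # same person applies again
for n in range(2, 2001):
    registry.assign(f"applicant-{n:03d}")
```

In practice the registry would also need a reliable way to recognize the same person across sites and applications, which this sketch simply assumes through the identifier.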

A third set of challenges has to do with the measurement of participant outcomes. Once again, this challenge derives, in large part, from the use of real-world populations as experimental subjects and from the large and heterogeneous subject pools common in social experiments. These make it more expensive to ensure high response rates than in smaller and more targeted field experiments.

In many cases this challenge can be addressed by using administrative data to measure some outcomes. Administrative records may come either from the program under study – for example, unemployment insurance payment records for studies of job search incentives for unemployment insurance recipients – or from the records of other government programs (e.g., tax records). While this can resolve the attrition problem at low cost, it is often contingent on government cooperation or approval. Such cooperation is more likely in large-scale social experimental evaluations of existing programs than in other types of studies. Administrative data can also limit the set of impacts that can be studied, potentially creating important ambiguities in the interpretation of estimated treatment effects. In the unemployment insurance case, for example, it is not clear whether a negative effect of increased job search enforcement on unemployment benefit payments indicates that people are finding jobs faster, or just that many people are leaving the program before finding jobs as a way of avoiding onerous enforcement procedures.

27 For a discussion of approaches to address this problem, including related software, see, e.g., Crepon et al. (2013).

IV. Going Beyond Treatment-Control Comparisons to Resolve Additional Design Issues

Whether one is interested in structural parameters or program evaluation, many questions of policy or scientific interest in labor and public economics require going beyond the basic RCT design described in Section II.a. We discussed a number of these questions in Section II.c. Here, we discuss ways to extend the basic RCT design to provide answers to these questions.

We organize our discussion around the major potential design issues we mentioned in Section II.c. For each, we discuss proposed solutions and, where relevant, point out potential extensions and limitations. We begin by discussing studies that address aspects relating to internal validity, including SUTVA violations (e.g., potential general equilibrium effects) and endogenously observed outcomes. We then discuss studies that address external validity concerns, including site and sub-group effects; effects on subpopulations other than experimental compliers; hidden or multiple treatments; mechanisms for treatment effects; and studies of optimal or simply alternative policies.

In some cases, the identified issues can be addressed ex post (after an experiment is complete), generally by imposing additional structure. In many of these examples the additional structure imposed is justified by appeal to theoretical considerations and is just sufficient to extend the RCT to address a specific question and the design issue it raises. In that sense, the studies can be viewed as an effort to bridge pure experimental or quasi-experimental approaches, credibly identifying a limited number of (potentially composite) causal parameters, with more traditional structural estimation that obtains a fuller characterization of the economic problem via the imposition of substantial additional assumptions. In the ideal case, they maintain the best of both worlds, though they also share some of the limitations of each.

Another possibility is to build the structural questions of interest into the design of the experiment ex ante. This can provide credible identification with even fewer structural assumptions than are required for after-the-fact analyses, though it can sometimes require a quite complex – and potentially difficult to administer – experimental design. There are fewer existing examples of this, but we discuss them where appropriate.

We discuss each of the design issues identified earlier in turn. Our discussion is meant to highlight the different approaches, as well as to clarify the scope, potential, and difficulties that arise when extending inference from standard RCTs to a broader range of questions.

a. Spillover effects and SUTVA

Social experiments in labor economics typically occur in the context of the local or regional labor market. If the number of workers participating in the program is large relative to the relevant segment of the labor market, the program could have an effect on the labor market outcomes of the control group. This would be a violation of SUTVA – the difference in outcomes between treated and control individuals would differ from the overall effect of the program on the entire population relative to not implementing the program, which is often the effect of primary interest.

Many social experiments in the United States have not raised serious spillover issues, as the treated populations have been small relative to the local labor market. However, this may not be true for large experiments, such as the National Job Corps Study. Welfare experiments may also create spillover effects if labor markets for former welfare recipients are sufficiently segmented.

A related issue is that comprehensive program evaluations in many cases should include spillover effects that are not captured by small-scale pilot studies. If the pilot programs are eventually scaled to broader populations of low-income workers – which has happened, among others, in the case of welfare reform, of training provided through WIA, or job search assistance services provided by WPRS or REA – then the potential extent of spillover effects would nevertheless matter, since any spillover effect would have to be included in a welfare assessment of the program. This would create systematic differences between the outcomes of the pilot study and the program effects of interest.

i. Addressing the issue ex post

Despite the potential prevalence of spillovers in social experiments in the labor market, relatively few studies have dealt directly with them or with other failures of SUTVA. A handful of studies have tried to estimate spillover effects directly using inter-regional comparisons (e.g., Blundell, Dias, Meghir, and Van Reenen 2004; Ferracci, Jolivet, and van den Berg 2010; Gautier, Muller, Rosholm, Svarer, and van der Klaauw 2012). There are roughly two approaches, neither of which is able to fully identify the spillover effect. One approach is to compare control group outcomes to those of observably similar individuals in areas where no one is treated. Of course, there may be other explanations for differences seen in this observational comparison. Another approach is to compare the effect of treatment across sites with different treatment intensity or labor market conditions. This is again typically an observational comparison, as in most cases neither the treatment site nor the size of the treatment group (and hence the amount of potential spillover) is randomly assigned. For example, Hotz (1992) discusses the non-random selection of sites for the JTPA evaluations. Allcott (2015) studies the sources of observed bias from site-selection in a large electricity conservation experiment. A recent paper by Crepon, Duflo, Gurgand, Rathelot, and Zamora (2013; see also Baird et al. 2015), discussed further below, resolves this problem in the context of a job search assistance program by randomly assigning both the treatment and the number of workers treated.

Absent such a multi-stage experimental design, relatively few options are available to researchers to assess the degree of the actual or potential spillover effects present in the context of their evaluation. An area of research where spillover effects have received substantial recent attention is the analysis of the employment and welfare impacts of extensions in unemployment insurance benefits. Here, spillover effects arise because treated and untreated individuals compete for the same positions; the degree of the spillover effect therefore depends on the job creation response to the treated group’s labor supply change. To assess the potential degree of spillovers, one can in principle use estimates of the matching function to adjust micro-econometric estimates of the effect of policy-induced changes in unemployment insurance durations on unemployment duration or exit hazards for the presence of crowding.28 Such ad-hoc simulations are partial-equilibrium in nature, and could be interpreted as a short-run effect, when vacancies have not yet adjusted. Landais, Michaillat, and Saez (2015) specify a general equilibrium model of the labor market that incorporates both crowding and vacancy responses. In a standard, competitive search-matching model, the vacancy response to changes in labor supply is sufficiently strong to offset the crowding effect completely.
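
To make the crowding adjustment concrete, the following Python sketch works through a deliberately simple partial-equilibrium case. Everything in it is an assumption chosen for illustration rather than taken from the studies cited: a Cobb-Douglas matching function with elasticity 0.5, vacancies held fixed, and a benefit extension that lowers treated workers' search effort from 1.0 to 0.8. It compares the within-market treated-control gap in job-finding rates (the micro estimate) with the change in the job-finding rate when everyone rather than no one is treated.

```python
def job_finding_rate(effort, total_effort, vacancies, efficiency=0.3, alpha=0.5):
    """Individual job-finding rate under an assumed Cobb-Douglas matching function.

    Matches = efficiency * total_effort**alpha * vacancies**(1 - alpha), and
    matches are shared among searchers in proportion to their effort.
    Quantities are per worker in the market.
    """
    matches = efficiency * total_effort**alpha * vacancies ** (1 - alpha)
    return effort * matches / total_effort


E_CONTROL = 1.0  # search effort without the benefit extension (normalized)
E_TREATED = 0.8  # assumed lower effort with extended benefits
VACANCIES = 1.0  # held fixed: the short-run, partial-equilibrium case


def market_effort(treated_share):
    # Average search effort per worker when a given share is treated.
    return treated_share * E_TREATED + (1 - treated_share) * E_CONTROL


# Micro estimate: treated vs. control workers in a market where 10% are treated.
effort_small = market_effort(0.10)
micro_effect = (job_finding_rate(E_TREATED, effort_small, VACANCIES)
                - job_finding_rate(E_CONTROL, effort_small, VACANCIES))

# Macro effect: everyone treated vs. no one treated.
macro_effect = (job_finding_rate(E_TREATED, market_effort(1.0), VACANCIES)
                - job_finding_rate(E_CONTROL, market_effort(0.0), VACANCIES))
```

Under these assumptions both effects are negative, but the macro effect is smaller in magnitude: when all workers search less, each unit of search effort faces less competition for the fixed stock of vacancies. Allowing vacancies to respond would change this comparison, which is the point of the general equilibrium analysis discussed in the text.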

In the spirit of using random variation in the treatment across localities to assess the presence of spillover effects, a couple of recent papers have tried to exploit region-specific changes in policy-induced UI variation in the U.S. to assess the full effect of the policy on the entire labor market (Hagedorn, Karahan, Manovskii, and Mitman 2015, Hagedorn, Manovskii, and Mitman 2015). Since UI variations usually depend on economic conditions at the state level, these studies use border communities unaffected by the policy change as counterfactuals.29 A concern with this approach is that the presence of spatial spillovers between adjacent or related labor market areas would again constitute a failure of SUTVA.30

28 One added difficulty in the case of UI is that in most cases in the U.S. the policy-induced changes in the level or duration of UI benefits are a function of labor market conditions – making it crucial to properly control for the direct effect of local labor market conditions.

Another source of SUTVA failures is interactions between treatment and control participants. Such ‘dilution’ effects can lead to an underestimation of the treatment effect. Where feasible, a typical approach to circumvent such interactions is to raise the level of randomization (say, from a sub-group within a site to a whole site). This approach can help to avoid interactions between individuals in the treatment and control groups. It does not resolve potential interactions between treated participants. Such interactions may be part of the mechanism of the treatment; they may also be a potentially unintended source of variation in treatment intensity that we discuss under site effects. In either case, when designing an evaluation, it would be valuable to consider ways of keeping track of social interactions, perhaps by asking about friends in a baseline survey, or monitoring (or manipulating) the use of certain kinds of social media. Another valuable target for data collection is factors relating to how treatment was obtained or takeup was decided. Such information may be used to stratify the analysis by the predicted degree of SUTVA violations or at least assess the potential for significant departures from SUTVA.

29 A key practical difficulty there is that measures of unemployment rates at the sub-state level are often very noisy. Estimates using administrative employment data based on the universe of private employees show little sign of spillover effects (Johnston and Mas 2015).

30 Cerqua and Pellegrini (2014) develop alternative estimates to the TOT that take into account the degree of spatial spillover effects. The Hagedorn et al. papers have been quite controversial; see, for example, responses from Chodorow-Reich and Karabarbounis (2016) and Coglianese (2015).

ii. Addressing the issue ex ante through the design of the experiment

In some circumstances it may be possible to avoid, or study, spillover effects by appropriately structuring a randomized experiment. For example, in the spirit of the non-experimental studies cited above, treatment and control groups could be chosen to be sufficiently distant to avoid spillover effects. Alternatively, the treatment group could be chosen to be sufficiently small that spillover effects are unlikely to be a problem. If the spillover effects themselves are of direct interest, the experimental manipulation could be combined with pre-existing variation in the strength of potential spillover effects (e.g., across submarkets), if available. The risk of such ad hoc or hybrid approaches is to potentially lose comparability of the control group, or to confound spillover with other variation in treatment effects.

A preferable approach if spillover effects are potentially present is to manipulate both the treatment and the size of the treatment group (and hence the amount of spillover) experimentally. Baird et al. (2015) develop this strategy formally. Crepon, Duflo, Gurgand, Rathelot, and Zamora (2013) implement it in the context of a public program assisting unemployed workers in their search for a job in France. The researchers randomize both who is assigned to the job search assistance program within a region (the classic experimental design) and, across regions, the share of individuals assigned to the treatment group. The manipulation of both regional treatment share and individual treatment status allows separate experimental identification of the effect of the program holding the spillover effect constant and the combined program and spillover effects at various treatment intensities. The latter parameters are ultimately relevant for a cost-benefit or welfare analysis of the program and for extrapolation to alternative policy settings.
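
A small simulation can illustrate what such a two-stage design identifies. The Python sketch below is ours and is not a reconstruction of the Crepon et al. study: the employment probabilities, the four saturation levels, and the assumption that displacement falls only on untreated workers are all made up for illustration. Regions are taken as already randomly assigned to a treated share, and because the simulated regions are otherwise identical they are pooled by saturation level.

```python
import random

# All parameter values below are made up for illustration.
BASE = 0.40          # employment rate with no program anywhere
DIRECT = 0.10        # gain to a treated worker
DISPLACEMENT = 0.08  # loss to an untreated worker if all others were treated


def simulate(saturations, workers_per_level, seed=0):
    """Return (saturation, treated, employed) rows for a two-stage design.

    Stage 1 (taken as given here): regions receive a treated share s.
    Stage 2: each worker in such a region is treated with probability s.
    """
    rng = random.Random(seed)
    rows = []
    for s in saturations:
        for _ in range(workers_per_level):
            d = 1 if rng.random() < s else 0
            p = BASE + DIRECT * d - DISPLACEMENT * s * (1 - d)
            y = 1 if rng.random() < p else 0
            rows.append((s, d, y))
    return rows


def mean_employment(rows, s, d):
    ys = [y for (sat, treated, y) in rows if sat == s and treated == d]
    return sum(ys) / len(ys)


rows = simulate([0.0, 0.25, 0.5, 0.75], workers_per_level=20000, seed=1)

pure_control = mean_employment(rows, 0.0, 0)  # untreated workers, untreated regions
within_gap = mean_employment(rows, 0.5, 1) - mean_employment(rows, 0.5, 0)
spillover = mean_employment(rows, 0.5, 0) - pure_control
effect_vs_pure_control = mean_employment(rows, 0.5, 1) - pure_control
```

With these assumed parameters the within-region gap at 50 percent saturation is about 0.14, of which roughly 0.04 reflects harm to untreated workers rather than gains to treated ones. A real analysis would also account for region-level clustering, which this sketch ignores.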

Similar strategies are available for other SUTVA failures, arising for example if some individuals in the control group get accidentally treated, or if treatment compliance depends on the takeup rate among peers. In some cases, one may choose the experimental setting to try to minimize SUTVA problems. For example, one can devise strategies to limit the potential for non-compliance (e.g., in case of web-based information treatments, access could be based on hardware address rather than passwords).

Another potentially interesting strategy is to make the degree and structure of SUTVA violations part of the analysis, as in the discussion of spillovers above. This may provide insights into the “black box” of how a program might work in a real life setting and hence enhance external validity.31 For example, one could experimentally vary the number of treated units in a reference group or network (e.g., classrooms, friends, etc.), examining interactions among individual treatment status, group treatment share, and perhaps also predetermined factors (such as the tightness of the group) that determine the degree of departure from SUTVA. Depending on the context, it may be possible to more explicitly manipulate interactions between individuals by introducing an additional treatment to the experimental design – for example, a forum in which interactions are facilitated.

31 Note that there is a parallel here with the issue of treatment compliance and heterogeneous treatment effects. Here, the compliance function is assumed to depend on treatment status of other individuals, and hence experimentally manipulating compliance probabilities is presumably more complex. Yet, as in the standard case of heterogeneous treatment effects, for external validity it is important to trace out the potential compliance-related interactions as fully as possible.

b. Endogenously observed outcomes

In many labor market experiments, key outcomes include measures observed only for individuals who are employed, such as hours worked and wages. Hence, the impact of, say, welfare-to-work programs or job training programs can only partially be assessed based on simple RCTs alone. Although many studies report experimental impacts on the endogenously observed outcomes, these are understood to suffer from serious selection problems. In the same way, non-random attrition in follow-up data collection can bias the results of nearly any evaluation.

To illustrate, consider a program aimed at unemployed workers that includes skill development and job search assistance modules. We are interested in whether the program raises the probability that a participant is employed one year after participation and whether it makes them more productive when employed. For simplicity, we assume that participation is randomly assigned and compliance is perfect.

We have two outcomes here. We denote employment status by $y_i = D_i y_{1i} + (1-D_i) y_{0i}$. For those who are employed at the follow-up survey, we observe the wage $w_i = D_i w_{1i} + (1-D_i) w_{0i}$. Treatment effects of the program on the two outcomes are $\tau_{yi} = y_{1i} - y_{0i}$ and $\tau_{wi} = w_{1i} - w_{0i}$. (We can imagine that $w_{di}$ is well defined for an individual with $y_{di}=0$, $d \in \{0,1\}$, but simply not observed. It can be thought of as the individual’s latent productivity, that which he/she would be paid if a job were found.)

Estimation of $E[\tau_{yi}]$ is straightforward, as discussed above. But the impact on wages is much harder. In general, it is not possible to identify the average treatment effect $E[\tau_{wi}]$; the treatment-on-the-treated effect $E[\tau_{wi} \mid D_i=1]$; or even the average treatment effect for the subpopulation that would have been employed with or without the program (for whom $\tau_{wi}$ is least problematic), $E[\tau_{wi} \mid y_{0i}=y_{1i}=1]$.

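The problem here is that it is impossible to distinguish, within each $D_i$ group, between those workers who would also have worked in the counterfactual and those who would not have. Consider the treatment-control difference in mean observed wages:

$$
\begin{aligned}
E[w_i \mid y_{1i}=1, D_i=1] &- E[w_i \mid y_{0i}=1, D_i=0] \\
&= E[w_{0i}+\tau_{wi} \mid y_{1i}=1, D_i=1] - E[w_{0i} \mid y_{0i}=1, D_i=0] \\
&= E[\tau_{wi} \mid y_{1i}=1, D_i=1] + \big(E[w_{0i} \mid y_{1i}=1, D_i=1] - E[w_{0i} \mid y_{0i}=1, D_i=0]\big) \\
&= E[\tau_{wi} \mid y_{1i}=1, D_i=1] \\
&\quad + \big(E[w_{0i} \mid y_{0i}=1, y_{1i}=1, D_i=1] - E[w_{0i} \mid y_{0i}=1, y_{1i}=1, D_i=0]\big) \\
&\quad + \big(E[w_{0i} \mid y_{1i}=1, D_i=1] - E[w_{0i} \mid y_{0i}=1, y_{1i}=1, D_i=1]\big) \\
&\quad - \big(E[w_{0i} \mid y_{0i}=1, D_i=0] - E[w_{0i} \mid y_{0i}=1, y_{1i}=1, D_i=0]\big).
\end{aligned}
$$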

The first term here is the average treatment effect in the subpopulation that works under treatment. It may not equal the overall average treatment effect, but insofar as the potential wages of those who do not work are not relevant to social welfare, it is arguably the parameter of interest. The second term solely reflects selection into treatment, and is zero under random assignment. But the third and fourth terms have to do with selection into employment, not selection into treatment. Random assignment does not ensure that they are zero, and the treatment-control contrast among workers may therefore be badly biased relative to the impact on wages for any fixed group of workers.32
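The stylized three-type example in footnote 32 can be checked numerically. The sketch below is only an arithmetic illustration: the wage levels `W_L` and `W_H` are arbitrary values chosen for the check, and the variable and function names are hypothetical.

```python
# Worker types from the footnote's example: (population share, w0, w1, y0, y1).
# W_L and W_H are arbitrary illustrative wage levels, not values from the text.
W_L, W_H = 10.0, 20.0
TYPES = [
    (0.6, W_L, W_L, 0, 1),  # always low productivity; employed only if treated
    (0.2, W_H, W_H, 1, 1),  # always high productivity; always employed
    (0.2, W_L, W_H, 1, 1),  # becomes high productivity if trained; always employed
]


def mean_observed_wage(treated):
    """Mean wage among those employed in one experimental arm."""
    total = employed = 0.0
    for share, w0, w1, y0, y1 in TYPES:
        wage, works = (w1, y1) if treated else (w0, y0)
        total += share * works * wage
        employed += share * works
    return total / employed


# Average effect on employment and on latent productivity (full population).
employment_effect = sum(s * (y1 - y0) for s, _, _, y0, y1 in TYPES)
latent_wage_effect = sum(s * (w1 - w0) for s, w0, w1, _, _ in TYPES)

# Average wage effect among those employed with or without the program.
always = [(s, w1 - w0) for s, w0, w1, y0, y1 in TYPES if y0 == 1 and y1 == 1]
always_employed_effect = sum(s * d for s, d in always) / sum(s for s, _ in always)

# Treatment-control contrast in wages among the employed.
naive_contrast = mean_observed_wage(True) - mean_observed_wage(False)

if __name__ == "__main__":
    gap = W_H - W_L
    print(employment_effect, latent_wage_effect / gap,
          always_employed_effect / gap, naive_contrast / gap)
```

Expressed as multiples of the wage gap, the contrast among the employed has the opposite sign from every causal effect in the example, which is the perverse result the footnote describes.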

One fallback approach is to examine only the program’s effect on the share of participants earning high wages, treating low-wage workers and non-workers the same. This effect can be estimated without bias. Another fallback is to include the non-employed in the wage analysis, with wages set to zero. In some cases this is the impact of interest, and it is correctly identified by the experiment. However, it is quite misleading if interpreted as the magnitude of the effect on productivity, either for the full population or for the subgroup that would have been employed with or without treatment. Without an ability to measure counterfactual employment status at the individual level, the latter effects are not identified.


32 Consider a training and job-search assistance program. Suppose 60% of workers will be always low productivity ($w_{1i}=w_{0i}=w_L$), 20% will be always high productivity ($w_{1i}=w_{0i}=w_H$), and 20% will become high productivity if exposed to the training sequence ($w_{0i}=w_L$, $w_{1i}=w_H$). All of the second and third groups will find jobs, with or without search assistance ($y_{0i}=y_{1i}=1$), but those in the first group of low-skill, impossible-to-train workers will find work if and only if they receive search assistance ($y_{0i}=0$, $y_{1i}=1$). In this setting, the program’s average treatment effect on employment is 0.6; the average effect on latent productivity is $0.2(w_H - w_L)$; and the average effect on wages of those who would work with or without the program is $0.5(w_H - w_L)$. The estimated treatment effect on wages conditional on employment is $-0.1(w_H - w_L) < 0$. Selection has led to a perverse estimate here: The training program has a positive effect on 20% of participants and a negative effect for no one, but the experiment appears to indicate that it reduces earnings.

Non-random attrition in particular has been a long-standing concern in the experimental literature in labor economics (e.g., Hausman and Wise 1979). A classic experimental design would be deemed successful if attrition is low and balanced in terms of magnitude and observable characteristics between the treatment and control groups. If this is the case, reweighting the samples may still recover the TOT or LATE among the original set of compliers (e.g., Ham and Li 2011). Yet, there are relatively few explicit attempts in the literature to address selection bias in other contexts.

A large literature in labor economics has dealt with sample selection problems, especially in the analysis of wages and hours in the context of the classic human capital and labor supply models. Largely based on that literature, here we will review several approaches to deal with selection bias: the use of control functions to address selection; estimation of percentile effects instead of mean impacts; use of additional data to control for selection; construction of bounds based on selection probabilities; and construction of bounds using theory.

Parametric selection corrections

The ‘classic’ approach to control for selection bias in estimating treatment effects on wages or hours worked is based on control functions. Labor supply theory, along with parametric assumptions, is used to derive an explicit expression for the selection bias in terms of the participation probability, which under monotonicity determines the amount of sample selection. This is then accounted for directly in the outcome equation (e.g., Gronau 1973, Heckman 1979).

Early on it was recognized that absent experimental variation in participation (e.g., an exogenous instrument affecting only participation and not the outcome equation), identification is only based on functional form assumptions, and results can be quite misleading if these assumptions are even slightly incorrect. By contrast, a substantial literature has shown that once an instrument for participation is available, treatment effects in the outcome equation can be identified under quite general functional form and distributional assumptions (e.g., Newey, Powell, and Walker 1990). For example, Ahn and Powell (1993) show that under assumptions of a single, strictly monotonic index for selection, variation in the probability of participation independent from the variables in the outcome equation suffices to control for selection. The difficulty is, of course, that such an independent source of variation is often not available.

Card and Hyslop (2005) consider a special case in which an RCT does generate exogenous variation in participation: an employment subsidy program. They show that if the program only has positive effects on labor supply and does not affect the wages for those who would have worked without it, then the experimental effect on the hourly wage can be consistently estimated by the ratio of the treatment effect on total earnings to the treatment effect on total hours worked.

Card and Hyslop’s assumptions are inappropriate for any program designed to affect wages and not just participation. Below we discuss how the experimental design itself may be modified to obtain exogenous variation in participation, even in programs with effects on multiple margins.

Non- and semi-parametric selection corrections

Absent an instrument for participation, in the presence of selection the treatment effect on mean wages is not identified. However, several studies have exploited the fact that under certain assumptions quantile treatment effects (QTEs) may be consistently estimated even in the presence of selection. A QTE for the q-th quantile is defined as the difference between the q-th quantiles of the outcome distributions in the treatment and control groups.33 It is not necessary to observe each individual’s outcome to compute the q-th quantile; it suffices to know whether each individual is above or below that quantile. Thus, if one can assume that all those who are not employed have potential wages in the bottom q percent of the distribution, one can estimate the treatment effect on the q-th quantile of potential wages by merely assigning all non-workers the minimum observed value (e.g., Powell 1984, Buchinsky 1994). Hence, under this assumption all quantiles above the nonemployment rate of the respective group can be identified. Because a QTE requires the quantile to be identified in both groups, the higher of the treatment and control groups’ nonemployment rates determines which QTEs can be identified.
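A minimal sketch of this imputation approach follows. It assumes, as in the text, that every non-worker’s potential wage lies below the quantile of interest; the function names and the convention of coding a non-worker’s wage as `None` are hypothetical choices made for the illustration.

```python
import math


def censored_quantile(wages, q):
    """q-th quantile of potential wages when non-workers (coded None) are
    assumed to lie below it and are assigned the minimum observed wage."""
    observed = [w for w in wages if w is not None]
    nonemployment_rate = 1 - len(observed) / len(wages)
    if q <= nonemployment_rate:
        raise ValueError("q must exceed the group's nonemployment rate")
    floor = min(observed)
    filled = sorted(floor if w is None else w for w in wages)
    # Smallest value whose empirical CDF is at least q.
    return filled[math.ceil(q * len(filled)) - 1]


def qte(treated, control, q):
    """Quantile treatment effect: difference in q-th quantiles across arms."""
    return censored_quantile(treated, q) - censored_quantile(control, q)


if __name__ == "__main__":
    treated = [None, 8, 10, 12, 14, 16, 18, 20, 22, 24]     # 10% not employed
    control = [None, None, None, 9, 10, 11, 12, 13, 14, 15]  # 30% not employed
    print(qte(treated, control, 0.5))
```

The error raised for low quantiles reflects the identification limit: a quantile at or below a group’s nonemployment rate falls among the imputed values and carries no information about wages.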

A variant of this approach is to examine the simple treatment-control difference in the probability of being observed in employment with a wage greater than some relatively high threshold $w^*$. For many program evaluations, understanding the impact on this outcome may be sufficient – it may not matter greatly whether the impact derives from moving some people from non-employment into high-wage employment or from simply lifting those who would have worked anyway into higher-wage jobs. And even when the latter component is the one of interest, this would be identified so long as those pulled into employment by the treatment have wages that are uniformly below $w^*$.

33 For any random variable $Y$ having cumulative distribution function $F(y)=\Pr[Y \le y]$, the $q$th quantile of $F$ is defined as the smallest value $y_q$ such that $F(y_q)=q$. If we consider two distributions $F_0$ and $F_1$, then $\mathrm{QTE}(q)=y_q(1)-y_q(0)$, where $y_q(g)$ is the $q$th quantile of distribution $F_g$.

It is not clear, however, that the required assumption holds – as pointed out by Altonji and Blank (1999), among others, at any given time, some high-wage individuals may be nonemployed. Moreover, this strategy is only useful insofar as differences in quantiles of the outcome are deemed sufficient for evaluating the effect of the program.

Another approach uses reservation wages to measure selection into the subsample of observed wages. This works because – if correctly measured – the reservation wage captures the lowest wage for which an individual is willing to work. Hence, the reservation wage provides the censoring point for an individual’s wage-offer distribution, allowing one to make inferences about potential wages for those individuals not working in the treatment and control groups. Johnson, Kitamura, and Neal (2000) use the minimum of all observed wages for an individual in longitudinal data to bound the reservation wage, under the assumption that it is stable over time. Grogger (2005) uses directly reported reservation wage information from a randomized evaluation of Florida’s Family Transition Program, a welfare-to-work program with emphasis on work incentives and time limits. With this information, he estimates the treatment effect of the program on wages using a bivariate, censored regression model that allows for classical measurement error in both observed wages and reservation wages. Once Grogger (2005) controls for selection, he finds the program had statistically significant positive effects on wages.

Addressing the selection problem using direct measures of reservation wages makes intuitive use of the reservation wage concept. Moreover, information on reservation wages is often already being collected in the context of programs providing job search assistance, or, if not, is at least in principle relatively easy to elicit if the experimental design includes a survey component. However, recent research suggests that in practice reported reservation wages appear to only partly reflect the properties of the theoretical concept (e.g., Krueger and Mueller 2016), casting some doubt on the robustness of this approach. In particular, Krueger and Mueller report that a substantial number of workers accept (reject) jobs offering wages below (above) their reservation wage, implying that care should be taken in using reservation wages of the nonemployed to make inferences about unobserved wage offers.

Yet another approach is to attempt to derive bounds for the treatment effect under conditions more general than the monotonicity assumption inherent in the Ahn and Powell (1993) and similar estimators. This allows researchers to investigate how severe the bias from selection could possibly be and what can be learned under general assumptions, rather than trying to obtain a point estimate under more restrictive assumptions.

One bounding approach is proposed by Horowitz and Manski (2000). This strategy asks how much the estimated treatment effect would be inflated if all missing treatment observations were assumed to have the highest possible outcomes and all missing control observations the lowest; then it asks how much it would be depressed if the opposite assumptions were made. Unfortunately, these bounds are typically not very tight, particularly when the outcome variable’s support is potentially unbounded, as for example in the case of wages.
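The worst-case logic can be sketched as follows for an outcome with known bounded support. This is a simplified illustration of the idea for a difference in means, not a full implementation of the cited estimator; the function name, the use of `None` for missing outcomes, and the example data are assumptions.

```python
def worst_case_bounds(treated, control, y_min, y_max):
    """Worst-case bounds on the treatment-control difference in means when
    some outcomes are missing (coded None) and the outcome lies in
    [y_min, y_max]."""

    def mean_filled(outcomes, fill):
        return sum(fill if y is None else y for y in outcomes) / len(outcomes)

    # Upper bound: missing treated outcomes at the top, missing controls at the bottom.
    upper = mean_filled(treated, y_max) - mean_filled(control, y_min)
    # Lower bound: the opposite imputation.
    lower = mean_filled(treated, y_min) - mean_filled(control, y_max)
    return lower, upper


if __name__ == "__main__":
    # Binary outcome, so the support is [0, 1].
    print(worst_case_bounds([1, 1, None, 0], [0, 1, None, None], 0, 1))
```

The width of the interval equals the outcome range times the sum of the two groups’ missing shares, which is why the bounds become uninformative when the support is wide or unbounded.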

Lee (2009) proposes a strategy for obtaining tighter bounds, via stronger assumptions: He assumes that anyone not employed in the control group would also have been non-employed had they been in the treatment group, so that selection bias arises solely from participants in the treatment group who are employed but would not have been had they been assigned to be controls.34 He can then bound the treatment effect by making extreme assumptions about this latter group. Denote the excess fraction employed in the treatment group by p. The upper (lower) bound is constructed by removing the lowest (highest) fraction p of observations from the treated subsample and recomputing the mean outcome for the treatment group – effectively making the worst-case assumption that selection was fully responsible for the entire upper or lower tail of values. Lee (2009) shows that the resulting bounds are sharp and provides formulas for the standard errors. In the case of Job Corps, the procedure results in informative bounds suggesting positive wage effects from training, although a zero effect is contained in the confidence interval.
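The trimming procedure can be sketched as below. The sketch makes several simplifying assumptions that should be read as such: wages of the non-employed are coded `None`, the trimming share p is expressed as a fraction of employed treated workers, only whole observations are trimmed, and no standard errors are computed. The function name is hypothetical.

```python
def trimming_bounds(treated, control):
    """Trimming bounds on the mean wage effect for workers employed in
    either arm, assuming treatment weakly raises employment."""
    t_obs = sorted(w for w in treated if w is not None)
    c_obs = [w for w in control if w is not None]
    emp_t = len(t_obs) / len(treated)
    emp_c = len(c_obs) / len(control)
    if emp_t < emp_c:
        raise ValueError("treatment lowers employment: reverse the arms' roles")
    # Excess employment in the treatment arm, as a share of employed treated workers.
    p = (emp_t - emp_c) / emp_t
    n_trim = round(p * len(t_obs))   # whole observations only (a simplification)
    keep = len(t_obs) - n_trim
    mean_c = sum(c_obs) / len(c_obs)
    upper = sum(t_obs[n_trim:]) / keep - mean_c   # drop the lowest treated wages
    lower = sum(t_obs[:keep]) / keep - mean_c     # drop the highest treated wages
    return lower, upper


if __name__ == "__main__":
    treated = [None, None, 5, 10, 15, 20, 25, 30, 35, 40]        # 80% employed
    control = [None, None, None, None, 10, 15, 20, 25, 30, 35]   # 60% employed
    print(trimming_bounds(treated, control))
```

In this example one quarter of employed treated workers are trimmed from each tail in turn, so the bounds bracket the untrimmed contrast.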

Lee’s (2009) approach based on trimming requires relatively weak assumptions. It presumes only that selection is monotonic in the treatment – that treatment either only increases, or only reduces, selection into employment. Monotonicity is implied by standard empirical binary choice models typically used to model participation choices (e.g., Vytlacil 2002), and hence bounds based on trimming are applicable to a wide range of problems, including selective employment, survey non-response, or sample attrition.

34 The roles of the treatment and control groups are reversed if the treatment reduces employment.

If one is willing to impose further structure from theory, one may obtain tighter bounds more specific to a particular problem. This is especially useful if the theory has explicit predictions about how the endogenous outcome responds to incentives.35 This is pursued by Kline and Tartari (2016), who analyze the randomized evaluation of Connecticut’s Jobs First welfare-to-work program. While previous analyses had found only small responses in hours (the intensive margin), absent an instrument for participation (the extensive margin) sample selection makes such estimates hard to interpret. Kline and Tartari (2016) use revealed preference arguments in the context of a canonical but non-parametric static labor supply model to describe which observed responses to the treatment at the intensive and extensive margin are consistent with the theory. Given the nature of the program studied, the result is a mapping of discrete counterfactual outcomes (including non-participation as well as participation at different intensities) under treatment and non-treatment, with restrictions on the allowable counterfactuals. The question then is how likely certain transitions are, and in particular whether changes at the intensive and extensive margin occur with positive probabilities. Since Kline and Tartari can only observe the marginal distribution across states for the treatment and control groups, they cannot point-identify the transition probabilities. Instead, they construct bounds for transition probabilities among the entire (discretized) distribution of states, including the probability of changes in the intensive margin due to the treatment. Their approach also allows them to test the restrictions from the model.

35 This may be more easily done for hours, which is typically assumed to be a choice variable, than for wages. Yet, to some degree the wage may be a choice variable as well, for example if jobs offer wage and effort combinations among which workers choose. This is the approach taken in some modern public finance, which often substitutes taxable earnings for hours worked as the choice variable in analyses of intensive-margin labor supply.

This approach is useful, since it allows Kline and Tartari (2016) to learn about intensive margin responses to the Jobs First program in the presence of selection. Their results could also be used to think about the likelihood of intensive margin responses for similar programs in similar populations. Alternatively, the estimated bounds from the matrix of transition probabilities could be used, along with the marginal distribution of labor supply under an existing program (AFDC, the program of the control group), to construct bounds for the intensive and extensive labor supply responses that could arise if Jobs First were implemented at another site. A potential issue is that the procedure is complex and the analysis is specific to the Jobs First program. Hence, while the general approach may be applicable to a range of problems, this would require careful specification of the decision problem, of the restrictions imposed by revealed preference theory, and of counterfactuals for each case. Nevertheless, since many social experiments are concerned with welfare and other programs that provide explicit variation in employment incentives and hence useful information on the likelihood of counterfactual outcomes, it is useful to consider the role that theory can play in providing bounds on treatment effects on endogenous outcomes.36


36 Similar approaches have been pursued in Blundell, Bozio, and Laroque (2011).

The endogenous outcome problem is often easily anticipated when designing an experiment, as it arises whenever outcomes like wages or hours are of interest and non-employment is a realistic possibility. There are various ways to adjust the experimental design to facilitate analysis of potential sample selection bias. For example, suppose in the case of the effect of a training program on wages the researcher believes that there are exogenous factors determining a worker’s labor supply decision. If these factors can be measured ex ante, the randomization could be stratified by the likelihood of employment as predicted by the exogenous instruments. Stratification would ensure sufficient sample sizes in each exogenous labor supply tier. (If only available ex post, say, in a follow-up survey, even absent stratification such variables can still be used as instruments for participation if sample sizes are sufficiently large.)

However, as it is usually difficult to come by good instrumental variables, the real power of a well-designed RCT would be to manipulate sample selection directly. In the training example, this would entail adding a second source of randomization that explicitly modifies the incentive to work (or the likelihood of finding a job) but does not otherwise affect the endogenous outcome. Whether this is feasible depends on the context. However, sample size considerations need not be a hurdle to adding a second treatment, since with cross-classified treatments the addition of a second treatment has little effect on the power for analyzing the effects of the first in isolation. This approach is particularly useful if one is interested in external validity, since the two-dimensional experimental variation may allow one to trace out the treatment effect of training for sub-populations with different employment probabilities.

In the case of non-random attrition, a version of this approach would be to randomly select a group of participants to follow up more intensively, perhaps stratified within groups with different ex-ante attrition probabilities. The contrast between mean outcomes in this subgroup and for other participants (again, perhaps within strata) identifies the selectivity of attrition, and can be used to adjust the full-sample estimated treatment effects. This is the approach pursued in the follow-up waves of the Moving To Opportunity experiment (e.g., Kling, Liebman, and Katz 2007). Another solution worth pursuing is to obtain administrative data for the universe of initial participants, including those who have failed to respond to follow-up surveys. Although these data can also be selected – they typically do not include earnings from informal jobs – the selection is different from that created by survey attrition, so the combination of sources can be valuable (though sometimes confusing, as in the Job Corps evaluation discussed above). Since merges to administrative data can usually be conducted only with identifying information from the survey and permission from participants, it is a good idea to factor the need for additional data into the initial research design.

c. Site and group effects

In many cases an essential problem is to identify the subpopulations that benefit most from a program, so as to target them for treatment. However, there are often many possible subgroups to examine. When many comparisons are estimated, the chance of a false discovery – a treatment-control contrast that is statistically significant, even though the true treatment effect is zero – rises toward one. Avoiding incorrect inferences in such a setting requires care.
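To see how quickly this probability rises, the sketch below computes the chance of at least one spuriously significant contrast among k tests. It assumes, purely for illustration, that the tests are independent and that every true effect is zero; the function name is hypothetical.

```python
def prob_any_false_discovery(k, alpha=0.05):
    """Probability that at least one of k independent level-alpha tests
    rejects when every null hypothesis is true."""
    return 1 - (1 - alpha) ** k


if __name__ == "__main__":
    for k in (1, 5, 14, 50, 100):
        print(k, round(prob_any_false_discovery(k), 3))
```

With correlated subgroup tests the exact numbers differ, but the qualitative pattern of a rapidly growing false-discovery probability remains.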

A version of the subgroup effects problem is to identify variation in treatment effects across program locations or sites. Such variation might arise from observed local characteristics – e.g., treatment effects of training or job search experiments may depend on the tightness of the local labor market. Where the relevant characteristics of the labor market are clear ex ante and their dimension is limited, this is relatively straightforward. But if the relevant dimensions are not clear or the number of potential contrasts is large, the multiple comparisons problem becomes relevant. Alternatively, there might be unintended variation in treatment intensity or in the fidelity or effectiveness of treatment delivery among treatment sites. Such site effects render the interpretation of the estimated treatment effect of the overall treatment difficult and limit external validity. If they are potentially important, we need estimates of each site’s separate effect. This implies that there are as many treatment effects to be estimated as there are sites at which the experiment is implemented.

A conceptual issue in evaluating the success of social experiments with site variation is to decide whether the parameter of interest is the effect of the program in its most successful variants, with strong local partners and appropriate local conditions, or the average effect across a range of local circumstances. When the latter is of interest, the ideal experimental design would involve drawing participants from all sites. But this is often impractical. More commonly, social experiments have been carried out at one or a few sites. These are often chosen because the local management is willing to participate, or because they are seen as exemplars of the program. This makes it difficult to interpret the experimental results as representative of the program as a whole (see, e.g., Hotz 1992 and Allcott 2015), but may come closer to identifying the program effect under close-to-ideal circumstances.37


On its face, it is straightforward to estimate heterogeneity of treatment effects along observed dimensions (e.g., race, gender, or past work experience) using data from an already-completed randomized trial: One simply constructs treatment-control contrasts separately for each subgroup. Many authors emphasize the importance of conducting the randomization separately for each subgroup of interest. This is not in principle necessary – unconditional random assignment ensures that assignment is random conditional on predetermined characteristics as well – but can add power for subgroup comparisons, especially in smaller samples.

37 A related but distinct problem is the question of ensuring “fidelity of implementation” in an RCT – a close alignment between the program’s intended design and the services that are actually delivered. While this is important for maximizing the statistical power of the experiment and for testing whether the program’s theory of action is correct, it limits the external validity for use in making judgments about the likely overall impact of real-world programs, which may not be implemented with high fidelity.

A more important issue is the potential number of comparisons to be estimated. If enough subgroup estimates are computed, even a program that has no effect on anyone will be likely to show a statistically significant effect for some subgroup. (A similar problem arises when considering effects on multiple outcomes.) Researchers have taken a number of approaches to this multiple comparisons problem. One is to specify the subgroups that will be considered, and the hypotheses of interest, before analyzing the data. This can limit the scope for unconscious data mining. It also ensures that the number of comparisons that were considered is known, so that the p-values of simple treatment-control contrasts can be adjusted for the multiplicity of the comparisons being estimated. An appropriate adjustment makes it possible to obtain accurate p-values for the test of whether the program had any effect on any subgroup. But two issues remain: These tests typically have very low power. In addition, even when they do reject they are often not able to identify which subgroups have non-zero treatment effects. A full discussion of adjustment for multiple comparisons is beyond the scope of this chapter, but Anderson (2008) is a useful reference.
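As one concrete illustration of adjusting p-values for a pre-specified set of subgroup contrasts, the sketch below applies Holm’s step-down procedure, a standard method for controlling the family-wise error rate. It is offered only as an example of the general idea, not as the specific adjustment recommended by the reference above; the function name and the p-values are hypothetical.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is scaled by m, the next by m - 1, and so on;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted


if __name__ == "__main__":
    # Unadjusted p-values for four hypothetical subgroup contrasts.
    print(holm_adjust([0.01, 0.04, 0.03, 0.20]))
```

In the example, three contrasts that are individually significant at the 5 percent level shrink to one after adjustment, illustrating the loss of power noted above.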

Multiple comparisons approaches can be useful as well for the analysis of treatment effects by site and/or provider. But the questions of interest regarding site effects are not generally whether each site’s effect is or is not different from zero, which is what multiple comparisons adjustments are designed to answer, but rather the magnitude and correlates of variation in treatment effects across sites. Moreover, the fact that the site-specific treatment effects can in some sense be seen as draws from a larger distribution opens up new options for analysis that are not available in traditional studies of subgroup treatment effects.

The mid-1990s National Job Corps Study, discussed above, illustrates some of the issues involved.38 As mentioned previously, the random-assignment study indicated that the program has a positive average effect on earnings four years after participation, of a magnitude roughly comparable to the return to a full year of education (Schochet, Burghardt, and McConnell 2008). (At the time of the evaluation, the average participant was enrolled for about eight months.)

But like other job training programs, the specific “treatment” provided to Job Corps participants varies substantially across individuals, according to perceived needs. Moreover, Job Corps services are delivered at 110 mostly residential centers, the majority of which are operated by private contractors. Some providers may be better at delivering an effective program (or at guiding participants to the types of services that they need) than are others. The center-specific treatment effects are thus of great interest.

The Department of Labor (DOL) has long used a performance measurement system to track performance of the different centers and inform decisions about contract renewal. Performance measures are non-experimental, and include statistics like the GED attainment rate or average full-time employment rate of program participants at each center. But it is not clear that these performance indicators successfully distinguish center impacts from differences in the populations served by the various centers.

38 Other studies that examine similar questions are Bloom, Hill, and Riccio (2005) and Barnow (2000). See also our discussion of treatment spillovers above.

Schochet and Burghardt (2008; hereafter “SB”) attempt to use the random-assignment Job Corps Study to validate DOL’s performance indicators (see also Barnow, 2000, who carries out a similar exercise for JTPA). In principle, estimation of site-level causal effects using the experiment is straightforward: One simply compares mean outcomes of the treatment and control groups at each site, relying on the overall random assignment to ensure balance of each site-level comparison. But a few challenges arise.

First, in the Job Corps Study randomization took place before applicants were assigned to centers. Thus, treated individuals are associated with centers, but control individuals are not. SB address this by using intake counselors’ assessments of the center that the applicant would most likely attend, collected prior to randomization. To ensure that treatment and control individuals are treated comparably, they use this prediction for both groups, even when it differs from the actual treatment assignment. Differences occurred for only 7 percent of treatment group enrollees, largely because participants tend to enroll in the closest center or in one that offers a particular vocational program.

Second, even a large RCT sample – the Job Corps Study included over 15,000 participants – can have very small sample sizes at the individual site level. Rather than estimate center-specific treatment effects, SB divide centers into three groups based on their non-experimental performance measures and estimate mean treatment effects for each group. Interestingly, they find that mean program impacts do not differ significantly across groups, suggesting that the performance measurement system is not successfully identifying variation in centers’ causal impacts. A related exercise is carried out by Bloom, Hill, and Riccio (2005), who first estimate statistically significant variation in treatment effects across 59 local offices that participated in three welfare-to-work experiments, then use a multi-level model to estimate the relationship between office characteristics – mostly having to do with the way that the treatment was implemented in each site, though they also include the local unemployment rate – and office-level treatment effects. In contrast to the Job Corps study, they do find significant associations of the treatment effect with both their implementation measures and the local unemployment rate.

Bloom, Hill, and Riccio’s (2005) interest is in identifying which program features are most effective. It is important to emphasize, however, that the association between site-level characteristics $X_j$ and the site-specific treatment effect $\tau_j$ is observational, not experimental, and does not bear a strong causal interpretation. It is quite possible that what appears, for example, to be a strong association between the emphasis that sites place on quick job placement and the site-level treatment effect instead reflects a non-random distribution of this emphasis across sites that vary in other important ways.

Like the Job Corps study, Bloom et al. (2005) do not investigate variation in site impacts conditional on $X_j$. In many settings, that variation might be of substantial interest. One might like, for example, to estimate effects of individual sites, or to ask which of a number of available performance measures do the best job of predicting experimental impacts. The latter question is a natural one to ask regarding the Job Corps Study, but to our knowledge it has not been pursued with experimental data (though see Barnes et al. 2014 for a related investigation using non-experimental data).

Much work on the estimation of site effects themselves comes out of efforts to measure hospital, school, or teacher performance (see, e.g., Jackson, Rockoff, and Staiger 2014 and Rothstein 2010). These studies are program evaluations, treating each site or teacher as a distinct “program,” but cannot rely on random assignment to identify program effects. As in the Job Corps Study, there are many sites but samples are frequently small at the site level, so – even if selection biases are set aside – site-specific treatment effect estimates are quite noisy. One consequence is that actual treatment effects will typically be closer to the average than are estimated effects, even when the research design permits unbiased estimation of each effect. Thus, it is common in these literatures to “shrink” the estimated treatment effects toward the mean. The procedure goes by many different names – e.g., shrinkage, Empirical Bayes, regularization, partial pooling, multi-level modeling – but the basic idea is that the posterior estimate of a site’s effect equals a weighted average of the unbiased estimate of that site’s effect and the mean site effect, with weights that depend on the precision of the site estimate.

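Let $\tau_j$ represent the impact of the program at site $j$, and suppose that across sites, $\tau_j \sim N(\bar{\tau}, \omega^2)$, where $\bar{\tau}$ is the mean site effect. Suppose that we have a noisy but unbiased estimate of the site $j$ effect: $t_j \mid \tau_j \sim N(\tau_j, \sigma^2)$. Then the former can be treated as a prior distribution for $\tau_j$. By Bayes’ Rule, the posterior mean of $\tau_j$ given the observed estimate is

$$E[\tau_j \mid t_j] = \bar{\tau} + f\,(t_j - \bar{\tau}),$$

where

$$f = \frac{\omega^2}{\omega^2 + \sigma^2}$$

is the reliability ratio of the site-specific effect estimate.39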
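The shrinkage formula can be illustrated numerically. The sketch below treats the mean site effect and both variances as known, whereas in applications they must be estimated; the function name, argument names, and numerical values are hypothetical.

```python
def posterior_mean(t_j, grand_mean, omega2, sigma2):
    """Posterior mean of a site effect given its noisy estimate t_j.

    omega2: variance of true site effects across sites.
    sigma2: sampling variance of the site's estimate.
    """
    f = omega2 / (omega2 + sigma2)          # reliability ratio
    return grand_mean + f * (t_j - grand_mean)


if __name__ == "__main__":
    # The same raw estimate is shrunk more when it is noisier.
    print(posterior_mean(10.0, 2.0, omega2=3.0, sigma2=1.0))
    print(posterior_mean(10.0, 2.0, omega2=3.0, sigma2=9.0))
```

A precisely estimated site effect stays close to its raw estimate, while a noisy one is pulled most of the way toward the mean.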

When the treatment effect varies systematically with site-level covariates – characteristics either of the treatment or of the counterfactual – this can be used to improve precision. If the site effects are modeled as a function of site characteristics, $\tau_j = X_j\beta + \nu_j$, with $\nu_j \sim N(0, \sigma_\nu^2)$, then the noisy site-level estimate $t_j$ should be shrunken toward the conditional mean rather than to the grand mean:

$$E[\tau_j \mid t_j, X_j] = X_j\beta + f'\,(t_j - X_j\beta),$$

where $f'$ is the conditional reliability ratio, $f' = \sigma_\nu^2/(\sigma_\nu^2 + \sigma^2)$. This is sometimes known in the statistics literature as “partial pooling.”

One use of the shrinkage approach is by Kane and Staiger (2008), who use a random-assignment experiment to validate non-experimental estimates of teachers’ treatment effects on their students. They shrink the non-experimental estimates, under the assumption that these estimates are valid, and ask whether the result is an unbiased predictor of a teacher’s treatment effects under random assignment.

39 The posterior mean is also known as an Empirical Bayes estimate. It is an unbiased predictor of the true site-level treatment effect $\tau_j$ if the site-specific estimates $t_j$ are unbiased estimates (Rothstein 2016).

Kane and Staiger focus on “value-added” scores, estimates of teachers’ effects on their students’ test scores from observational regressions, as the sole non-experimental estimate. They fail to reject the hypothesis that these scores are unbiased predictors of the experimental effects, consistent with the view that they are unconfounded by student sorting. But the experiment has quite low power to distinguish alternative explanations, and Rothstein (2016) argues that the question remains unresolved.40

Angrist et al. (2015) explore the optimal combination of experimental estimates with potentially biased but more precise non-experimental estimates to obtain minimum mean-squared-error predictions of schools’ treatment effects. A related question is whether non-experimental measures of other parameters (e.g., classroom observations) can improve the prediction of experimental effects. If so, one might want to use a weighted average of the available measures, weighted to best predict the experimental treatment effect, for performance measurement purposes. To our knowledge, no study has attempted to estimate these weights in an experimental setting (though see Mihaly et al., 2013, for a non-experimental analysis).


Ultimately, small sample sizes have limited analysts’ ability to identify site- or group-level variation in treatment effects. But there may be ways to design experiments to better support these investigations. Most obviously, resources can be put into collecting data on variation in the quantity and types of treatments delivered, to support analyses (like that of Schochet and Burghardt 2008 or Bloom et al. 2005) of how site treatment effects vary with observable measures of site treatment variation. Large-scale program evaluations often include implementation analyses alongside randomized impact evaluations, and if these two portions were closely integrated the results of the implementation study could be used to inform an analysis of site effects in the impact evaluation sample. Power can also be improved by conducting randomization within site-level strata and by minimizing non-compliance rates (and carefully measuring treatments actually received).

[40] For more on the topic of teacher value-added, see Chetty, Friedman, and Rockoff (2014) and Rothstein (2016).

d. Treatment effect heterogeneity and external validity

The empirical literature on program evaluation has been increasingly aware of the importance of potential heterogeneity in treatment effects for interpreting estimates of program impacts and assessing their external validity. Many evaluation samples are drawn from specific populations – individuals in particular regions or cities, individuals entering a program in a certain way, or individuals thought suitable for a proposed alternative program. If treatment effects vary, generalizing from these samples to a broader population is hazardous. Another variant of the external validity problem arises when the compliance rate in the experimental sample differs from what would be expected outside the experiment, as the experimental LATE may not correspond to an appropriate complier population for the program evaluation of interest.

There are several potential sources of heterogeneity. In the previous section, we discussed differences in characteristics of the environment (such as the state of the labor market, including the business cycle and industry or occupation structure, population density, or labor market discrimination) and differences in aspects of the program (such as unintended differences in the intensity of treatment, something we address under site effects). In this section, we focus on the case where treatment effects vary because of differences in characteristics at the individual level (such as preferences, abilities, health, beliefs, resources, family environment, or access to networks). Below and in Section IV.f, we also discuss variation in treatment effects arising because of variation in structural aspects of the program, such as differences in work incentives.


The literature is broadly in agreement on how to deal with heterogeneity in treatment effects by observable characteristics of study participants. As discussed in Section IV.c, the experimental design implies that one can obtain consistent estimates of the treatment impact for each subgroup, subject to having sufficiently large sample sizes. One can then extrapolate the TOT and ATE to settings with other distributions of observable characteristics by constructing appropriately weighted averages of subgroup effects and corresponding standard errors. As a more common alternative, one can directly estimate TOT and ATE for another population by reweighting the original sample to match the distribution of observable characteristics of the target population (e.g., DiNardo, Fortin, and Lemieux 1996). If multiple treatment sites are available, in principle a similar approach can be used to assess the effect of environmental characteristics, such as labor market conditions or industrial structure.
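The weighted-average extrapolation described above can be sketched in a few lines of Python. This is only an illustration: the function name `reweighted_effect` is hypothetical, and the standard error formula assumes that the subgroup estimates are independent and that the target population shares are known without error.

```python
import numpy as np

def reweighted_effect(effects, ses, target_shares):
    """Average subgroup treatment effects using target-population shares.

    effects       : estimated treatment effect in each subgroup
    ses           : standard error of each subgroup estimate
    target_shares : share of each subgroup in the target population
    """
    effects = np.asarray(effects, dtype=float)
    ses = np.asarray(ses, dtype=float)
    w = np.asarray(target_shares, dtype=float)
    w = w / w.sum()  # normalize the shares to sum to one
    estimate = np.sum(w * effects)
    # Assumption: independent subgroup estimates and fixed, known weights.
    se = np.sqrt(np.sum((w * ses) ** 2))
    return estimate, se
```

For example, subgroup effects of 1.0 and 3.0 with standard errors 0.3 and 0.4, combined with equal target shares, give an extrapolated effect of 2.0 with a standard error of 0.25.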

The case of heterogeneity by unobserved characteristics has presented greater challenges. Unfortunately, the individual-level treatment effect is generally not identified by either experimental or non-experimental methods. Even with perfect compliance, an experiment identifies only the average treatment effect conditional on observed characteristics.

Some argue that average treatment effects are sufficient for most purposes, as we care only about the distributions of outcomes under alternative policies and not about the positions of particular individuals within those distributions. This is a controversial claim, however – in many contexts, a program that helped some individuals but hurt others by an equal amount, with zero average effect, would be judged worse than nothing.

Moreover, average effects may not be generalizable beyond the population (with perfect compliance, experimental participants, or with imperfect compliance, the subgroup of compliers) identified by an experiment. With heterogeneous treatment effects, neither the TOT nor the complier LATE may be relevant for other populations of interest. A key question then is how representative the experimental compliers are of the group of people that would be potentially affected by the program in question. In many cases the program compliers are likely to be similar to the population of interest, in which case the complier LATE is likely to approximate the relevant parameter. In other cases – for example when compliance is likely to differ between the study and the program at scale – the estimated LATE from one program evaluation may be less useful.

Heckman and Vytlacil (2005) propose a conceptual framework to analyze heterogeneity in treatment effects that relies on the concept of the marginal treatment effect (MTE). If $\tau_i$ denotes the individual treatment effect, $X_i$ is a vector of observed individual characteristics, and $v_i$ is the error in the equation determining take-up of treatment, then the marginal treatment effect is defined as $E[\tau_i \mid X_i = x, v_i = v]$; of interest is how this varies with $v$. This structure provides a framework for considering external validity. The traditional LATE obtained from analyses of experiments with noncompliance can be seen as the integral of the MTE over a particular range of $v$, but proposals to expand or roll back programs may implicate MTEs at other $v$ values.
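The link between the LATE and the MTE can be written out under a normalization that the text does not spell out, so the notation here is an assumption made for illustration. Let $u_i \in [0,1]$ denote the quantile of $v_i$, ordered so that an individual takes up treatment when $u_i$ is below the participation probability, and suppose random assignment moves that probability from $p_0$ (control) to $p_1$ (treatment) for individuals with $X_i = x$. Then, as a sketch:

```latex
\mathrm{MTE}(x, u) = E[\tau_i \mid X_i = x,\, u_i = u],
\qquad
\mathrm{LATE}(x;\, p_0, p_1) = \frac{1}{p_1 - p_0} \int_{p_0}^{p_1} \mathrm{MTE}(x, u)\, du .
```

A policy that changes participation over a different interval of $u$ averages the MTE over that interval instead, which is why the experimental LATE need not carry over.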

To move beyond the LATE, we require a multi-valued instrument that can map out the full distribution of $v$ (or, equivalently, the full range of $\Pr(T=1 \mid X)$). If such an instrument is available, the MTE can be obtained by a non-parametric regression of the outcome on the fitted probability of program participation resulting from the first stage equation.[41]

This is not possible in the case of a simple RCT. However, when the RCT is implemented at multiple sites, and if one is willing to assume that heterogeneity of site effects is limited to compliance rates with no variation in effects on the outcome, one can examine the relationship between the site-specific compliance rate and the site-specific estimated treatment effect (i.e., the site-specific LATE).[42] (Alternatively, one could directly regress the site-specific treatment effect on the estimated probability of take-up and obtain the MTE for different compliance rates.) This relationship could in principle be used to forecast the local average treatment effect at a potential alternative treatment site (possibly reweighting to adjust for differences in observable characteristics), given a forecast of the new site’s compliance rate. More generally, this approach would allow inferring the effect of any intervention affecting the cost of compliance and hence the compliance rate itself.

[41] Many other relevant parameters, including LATE and ATE, can be expressed as functions of the MTE. However, to estimate the ATE or the TOT, say, one needs to obtain the MTE for each value of X for the full range of complier probabilities, i.e., from 0 to 1. While in many cases this may be infeasible due to data limitations, if available this could be used to extrapolate the ATE or TOT for populations with different compliance rates and distribution of characteristics.

[42] Note that the weighting function of the LATE estimator for multi-valued instruments in Angrist and Imbens (1995) is proportional to the differences in take-up probabilities between different values of the instrument (ordered by the values’ impact on take-up). This difference can be interpreted as the difference in compliance between instrument values.
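A minimal sketch of the site-level forecasting exercise just described, assuming (purely for illustration) that the site-specific LATE is linear in the site compliance rate and that sites are weighted by the inverse of their sampling variance. The function names are hypothetical, and the linear functional form is a choice of this example rather than something implied by the framework.

```python
import numpy as np

def fit_late_on_compliance(late, compliance, se):
    """Weighted least squares of site LATEs on site compliance rates.

    Returns (intercept, slope). Weights are 1 / se^2.
    """
    late, compliance, se = (np.asarray(a, dtype=float)
                            for a in (late, compliance, se))
    X = np.column_stack([np.ones_like(compliance), compliance])
    w = 1.0 / se ** 2
    XtW = X.T * w  # scales each site's column by its weight
    return np.linalg.solve(XtW @ X, XtW @ late)

def forecast_late(beta, new_compliance):
    """Forecast the LATE at a new site given its forecast compliance rate."""
    return beta[0] + beta[1] * new_compliance
```

With only a handful of sites the fitted relationship will be imprecise, and the forecast inherits the assumption that sites differ only in compliance.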

At times it is useful to go further, to estimate the full distribution of treatment effects. The above method will not accomplish this. Heckman, Smith, and Clements (1997) show that without additional assumptions, experimental data is essentially uninformative about the treatment effects distribution. Moreover, they demonstrate that quite strong assumptions on the dependence of counterfactual outcomes in the control and treatment states are needed to obtain plausible estimates of the distribution of the effect of training in the context of the National Job Training Partnership Act (JTPA) study. Nevertheless, as mentioned at the outset, knowledge of the distribution of heterogeneous treatment effects is undoubtedly important in assessing the impact of a particular program (though it is less straightforward how such information can be used to address the issue of external validity if treatment effects vary purely with unobserved characteristics).

One approach that has been used to make inferences about heterogeneity in treatment effects is estimation of quantile treatment effects (QTE). As discussed in Section IV.a, the QTE for the q-th quantile is defined as the difference in the q-th quantile of the outcome distribution in the treatment and control groups, respectively. It is clear that absent strong assumptions, such as rank stability, QTEs do not recover the distribution of treatment effects (though they do recover the effect of the treatment on the outcome distribution, which may be sufficient for many purposes; see Athey and Imbens, this volume). Yet, QTE analysis can be a helpful and easy-to-implement diagnostic device in at least two senses. First, a QTE analysis can be used to test the assumption of constant treatment effects, which would imply that the QTE is equal at all quantiles. Second, in some cases particular features of a program allow one to derive predictions as to responses in different quantiles of the outcome distribution (see below). More generally, QTE may provide a broad descriptive sense of potential treatment responses.
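Because the QTE is just a difference of quantiles across the two arms, it can be computed directly from the outcome data. The sketch below assumes simple random assignment with full compliance and no covariates; the function name is hypothetical, and inference (for example by bootstrap) is omitted.

```python
import numpy as np

def quantile_treatment_effects(y_treat, y_control, quantiles):
    """Difference between treatment- and control-group outcome quantiles."""
    q = np.asarray(quantiles, dtype=float)
    return np.quantile(y_treat, q) - np.quantile(y_control, q)

# Under a constant treatment effect the QTE is flat across quantiles.
y_control = np.arange(101.0)
qte_constant = quantile_treatment_effects(y_control + 5.0, y_control,
                                          [0.25, 0.5, 0.75])
# A treatment that scales outcomes produces QTEs that rise with the quantile.
qte_scaling = quantile_treatment_effects(2.0 * y_control, y_control,
                                         [0.25, 0.5, 0.75])
```

A flat profile is consistent with constant effects, while a sloped profile rejects them; neither pattern identifies which individuals gained or lost without an assumption such as rank stability.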

One source of treatment effect heterogeneity is differences in the structure of the program to be evaluated. In this case, theory may provide weak assumptions that allow making inference on the distribution of treatment effects. Welfare programs represent a good example, since they usually combine a range of different labor supply incentives arising among others from welfare payments, earnings disregards, implicit tax rates, or phase-out regions. Clearly, these incentives interact locally with individual heterogeneity in preferences or ability, something we will return to below. But the additional structure can make for more natural identifying restrictions than in the case of a program that is at least intended to be uniform, such as a training course. A series of papers has addressed this question in the context of evaluation of Connecticut’s welfare-to-work program, Jobs First, against the then-prevailing alternative welfare program. For example, to assess the degree of heterogeneity in treatment responses Bitler, Gelbach, and Hoynes (2006) implement a QTE analysis as described above, and relate the resulting estimates to predictions from a standard labor supply model. The Kline and Tartari (2016) study discussed above, aimed at bounding transition probabilities between counterfactual states, takes advantage of across-participant observable differences in the nature of the decision problem faced to construct revealed-preference restrictions on the set of potential transitions. This is an important diagnostic device for assessing the range of counterfactual treatment responses to the program itself. As discussed above, a potential drawback is that the procedure is rather complex and only applies to the particular program studied. One also has to contend with possibly wide bounds.

In principle, Kline and Tartari’s approach can also be used for predicting the effect on the distribution of marginal outcomes of moving from traditional welfare to a welfare-to-work program of the same structure at another site (see Section III.a). Yet, it is worth keeping in mind that the estimated bounds have the LATE property, i.e., they may depend on the particular distribution of individual characteristics and the local environment. Extrapolating to different populations or environments in their context would require imposing additional assumptions on the underlying static labor supply model, and would thus trade off additional predictions against robustness.


There may be an opportunity to make more progress on this type of treatment effect heterogeneity by building it into the randomization design. Cross-classified and multiple treatment group experiments can be quite helpful for identifying variation in treatment effects.

In some cases, we are directly interested in understanding the distribution of treatment effects. When a plausible structural model (perhaps something as simple as a Heckman-Vytlacil (2005) Roy model) is available, one might use the structural model to predict individual treatment effects, then stratify the experiment based on these predictions. The NIT studies can be seen as a version of this, as these were stratified based on prior earnings, a potentially strong predictor of the treatment effect.

In other cases, concerns about heterogeneity are driven by potential differences between the complier LATE and the population ATE. Rather than simply assigning participants to be offered or not offered the treatment, one might also vary the extent of efforts to enforce compliance with the experimental assignment. When the relevant selection is thought to be based in part on the anticipated individual treatment effect, as in Heckman and Vytlacil (2005), one can identify the MTE curve directly by randomly assigning participants to multiple values of the incentive (or cost) to obtain the treatment.

Which of these is appropriate depends on the nature of the selection into compliance in the experiment, and how it relates to what would be observed in a non-experimental setting. To make things concrete, we will consider a study in which applicants are randomly assigned to be eligible or ineligible to receive training offered at a particular job-training center. One might expect that compliance rates will be low for those assigned to the treatment group for whom it is inconvenient to travel to the program site. One might then expect the LATE to vary with travel costs, but in a simple experiment there is no way to estimate how much of this is due to differences in average treatment effects between those who live close to and far away from the program site and how much to differences in selection into the complier group.

One way to learn about this would be to implement a more complex, multiple treatment arm experiment in which a subset of individuals offered access to the training are also offered transportation to the training site. If the distance-treatment effect curves differ between the two treatment arms, one can conclude that selection into participation is important, and this can then be used (with a parametric selection model) to estimate how the LATE for a similarly-selected complier population varies with distance. This may be important if the goal is to generalize from the experiment to a scaled-up program that would offer training at a wider number of sites.

One can also use the three-arm experiment to identify the MTE curve, but only with strong restrictions on the shape of this curve (which correspond to strong parametric assumptions about the selection process; see Brinch, Mogstad, and Wiswall forthcoming). These restrictions may be unattractive. If an important goal of the study is to understand how treatment effects vary with the costs of participation, an even more complex experimental design might be called for. Rather than assigning individuals to a treatment group that receives training at zero cost or a control group that is denied access to training at any price, one might use multiple groups that are offered training at different price points (including potentially negative prices). Variation in outcomes across these groups will trace out several points on the MTE curve and can be used to identify a more flexibly shaped curve under weaker assumptions.


Cross-classified and multiple treatment arm experiments raise a number of practical issues that are not confronted in classical treatment/control studies. First, allocating observations across many arms reduces power to detect differences in outcomes between any pair of treatments. Researchers designing experiments must therefore trade off the benefits of a multiple-treatment-arm experiment against reduced ability to detect particular pairwise contrasts. This issue can sometimes be addressed, however, when the alternative arms can be seen as varying the dosage of a single well-defined treatment. An experiment where all treated individuals are assigned a treatment dose of 1 gives less power for identifying a linear dose-response relationship than one where the same individuals are assigned varying doses with a mean of 1 (for example, when half are assigned a dose of 0.5 and half are assigned 1.5); moreover, the latter design provides at least the chance of detecting nonlinear effects.
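The power comparison in the dose example can be checked with a short calculation. Under a linear dose-response model with homoskedastic errors, the variance of the OLS slope is the error variance divided by the sum of squared deviations of the dose from its mean, so more dispersed doses give a more precise slope. The sample size, the even split between control and treated units, and the unit error variance below are assumptions chosen only for illustration.

```python
import numpy as np

def slope_variance(doses, sigma2=1.0):
    """Variance of the OLS slope in y = a + b * dose + e, Var(e) = sigma2."""
    d = np.asarray(doses, dtype=float)
    return sigma2 / np.sum((d - d.mean()) ** 2)

n = 400  # hypothetical sample: half control (dose 0), half treated
control = np.zeros(n // 2)

# Design 1: every treated unit receives a dose of 1.
single_dose = np.concatenate([control, np.ones(n // 2)])
# Design 2: treated units split between doses 0.5 and 1.5 (mean dose still 1).
varied_dose = np.concatenate([control,
                              np.full(n // 4, 0.5),
                              np.full(n // 4, 1.5)])

var_single = slope_variance(single_dose)  # 1 / 100
var_varied = slope_variance(varied_dose)  # 1 / 150
```

In this configuration the varied-dose design cuts the slope variance by a third, though the gain depends on the dose-response relationship actually being linear.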

Cross-classified experiments, with a fraction $p$ assigned to treatment A and a fraction $q$ independently assigned to treatment B, can also be seen as sacrificing power, though again the reality is more complex. Let $y_{abi}$ represent the potential outcome for individual $i$ when the program A assignment is $a$ ($a$ = 0 or 1) and the program B assignment is $b$. The traditional estimand for evaluation of program A is $E[y_{10i} - y_{00i}]$. Only $(1-q)N$ of the $N$ observations in the cross-classified experiment can be used for estimating this quantity, as the other $qN$ observations are assigned to receive treatment B. But the experiment has full power for estimating the alternative treatment effect $E[((1-q)\,y_{10i} + q\,y_{11i}) - ((1-q)\,y_{00i} + q\,y_{01i})]$. This can be seen as a weighted average of two treatment effects of program A, one that applies to individuals who also receive program B and one for those who do not. In some cases, this may be of more interest than the traditional estimand – e.g., when the scaled-up version of program A will coexist with program B.
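A small simulation can illustrate the two estimands. All numbers here are hypothetical: the effect of A is set to 1.0 for individuals not assigned to B and 2.0 for those assigned to B, B has its own main effect of 0.5, and outcomes carry standard normal noise. The pooled A-versus-not-A contrast should then be close to the weighted average $(1-q)\cdot 1.0 + q \cdot 2.0$, while the contrast restricted to individuals without B recovers the traditional estimand from a smaller sample.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, q = 200_000, 0.5, 0.3
effect_without_b, effect_with_b = 1.0, 2.0  # hypothetical effects of A

a = rng.random(N) < p  # assignment to program A
b = rng.random(N) < q  # independent assignment to program B
noise = rng.normal(size=N)
y = 0.5 * b + np.where(b, effect_with_b, effect_without_b) * a + noise

# Full-sample contrast: estimates the q-weighted average effect of A.
pooled = y[a].mean() - y[~a].mean()
# Contrast among those not assigned to B: the traditional estimand.
narrow = y[a & ~b].mean() - y[~a & ~b].mean()
target = (1 - q) * effect_without_b + q * effect_with_b
```

The pooled contrast uses all N observations, whereas the narrow contrast uses only about (1-q)N of them and is correspondingly noisier.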

e. Hidden treatments

A long-standing issue in the interpretation of job training program evaluations is that these evaluations commonly have substantial rates of non-compliance and crossovers. Many people assigned to receive training do not complete their courses, and it has been operationally and politically difficult to exclude people assigned to the control group from receiving treatment, either from the same provider that serves the treatment group or from an alternative provider. Indeed, in some cases, ethical concerns led to decisions to actively inform control group individuals about alternative sources of training.

Much of the literature treats this as non-compliance of the type discussed in Section II.b.ii, and so estimates the training effect by dividing the ITT effect by an estimate of the complier share (see, e.g., Heckman, Hohmann, Smith, and Koo, 2000). But this is unsatisfactory when the control group non-compliers receive a different treatment – e.g., training from a different provider – from that given to the treatment group. In technical terms, this is a violation of SUTVA; practically, it means that assignment to treatment may affect outcomes even for the always-takers who receive (some type of) training in any case. To our knowledge, this issue has not been addressed in the enormous literature on job training experiments. (Heckman et al., 2000, note the issue, but their analyses focus on non-random selection into training and heterogeneity of training effects, which are related but distinct issues.)
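For reference, the conventional calculation that this paragraph criticizes, dividing the ITT by the complier share, can be sketched as follows. The function and variable names are hypothetical, and the sketch takes as given exactly the assumption questioned above, namely that training received by control-group crossovers is the same treatment as that received by the treatment group.

```python
import numpy as np

def wald_estimate(y, z, d):
    """ITT effect divided by the estimated complier share.

    y : outcomes
    z : random assignment (True = assigned to training)
    d : indicator for actually receiving training
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=bool)
    d = np.asarray(d, dtype=float)
    itt = y[z].mean() - y[~z].mean()
    # Take-up among the assigned minus crossover among controls.
    complier_share = d[z].mean() - d[~z].mean()
    return itt / complier_share, itt, complier_share
```

If crossovers receive training that goes unrecorded in `d`, the complier share is overstated and the ratio understates the effect, which is the measurement problem taken up next.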

Even the IV approach, unsatisfactory as it is, is often not feasible: It requires measuring the share of the control group that crosses over. In many cases, this is not available: The experimental data includes information on the receipt of services from the program under study but not on services obtained from other sources. In this case, only intention-to-treat (ITT) estimates can be computed. But these are attenuated by the failure to measure the “hidden” alternative treatments.


A very recent literature takes up this topic in the context of the Head Start pre-school program. The Head Start Impact Study randomly assigned Head Start applicants to be offered care or turned away. Many of the control group applicants (and a smaller share of the treatment group) wound up receiving alternative center-based child care that is thought to be less effective but may be a partial substitute. Where traditional IV estimators treat this as equivalent either to the Head Start treatment or to the receipt of no services, it might be more appropriate to treat it as a distinct, “hidden” treatment.

Walters (2014) estimates heterogeneity in the Head Start effect across centers (sites), finding (among other results) that the LATE of Head Start participation is smaller when more of the complier group is drawn from other centers rather than home-based care. This is suggestive that other center-based care is distinct from home-based care.


Kline and Walters (2014) explicitly model the hidden alternative center treatment, using variation in the compliance patterns across participants’ observable characteristics (e.g., parental education) to identify a multinomial variant of a Heckman (1979) parametric selection correction and thus obtain partially experimental estimates of the separate effects of the two types of child care. Their approach leverages variation across observable characteristics (X) in the share of experimental compliers who are drawn from alternative center care, together with a utility-maximizing choice model that constrains how selection on unobservables varies with X. With the restrictions imposed by this model, they find large effects of Head Start relative to home-based care. As the Head Start experiment did not directly manipulate the choice between home-based and other center care, they are not able to estimate the relative effect of these with any precision in their least restrictive model, though point estimates are consistent with an effect of other centers comparable to that of Head Start. When Kline and Walters impose stronger restrictions on the selection process, they obtain similar point estimates but with more precision.

Feller et al. (2014) also examine the hidden treatments issue in the Head Start Impact Study sample. They use a principal post-stratification approach that, like Kline and Walters, exploits variation across observables in selection into the two treatments. They couple this to a finite mixture modeling strategy that treats the separation of the two complier subgroup distributions as a deconvolution exercise. Parametric assumptions about these distributions are used to identify the local average treatment effects of the two treatments. Results are similar to Kline and Walters: Head Start has positive effects on those who would otherwise be at home, but little effect on those who would otherwise receive alternative center-based care.

Another example of the analysis of hidden treatments is Pinto’s (2015) analysis of the Moving to Opportunity experiment. In one view, the MTO study involved two treatment arms: One offered a housing voucher that could be used anywhere, and the other restricted the voucher to a low-poverty neighborhood. Straightforward experimental comparisons identify the ITT and LATE of usage of each type of voucher. In another view, however, the relevant treatment is the type of neighborhood in which the participant lives. Kling, Liebman, and Katz (2007) use variation across the two treatment arms and across sites to identify effects of neighborhood poverty (under restrictions on treatment effect heterogeneity). Pinto (2015) adds more structure, using revealed preference restrictions – anyone offered an unrestricted voucher who moves to a low-poverty neighborhood can be assumed to choose the same type of neighborhood in the counterfactual where she receives a restricted voucher – to identify parameters of interest concerning the distribution of neighborhood-type treatment effects.[43]


The Pinto (2015) study takes advantage of the multiple-treatment arms in the MTO experiment, while the Head Start papers discussed above exploit, in various ways, the use of centers as strata in that experiment. This suggests, correctly, that complex experimental designs may be useful in resolving hidden treatment problems, and that a researcher interested in these problems might be able to design an experiment with them in mind. In the neighborhood effects example, one might want to have several treatment arms that vary in the restrictions they place on neighborhood choice; for Head Start, one might explore a third treatment arm that provides a voucher usable either at a Head Start center or at an alternative center. This design might also be useful for a job training evaluation.

[43] Pinto’s analysis assumes that the set of neighborhoods in which a voucher can be used is the only relevant difference between the two treatment arms. But in MTO low-poverty voucher recipients were also offered counseling that may have had independent impacts on neighborhood choice or even on outcomes.

In each of these cases, it is crucial to collect information about the type and amount of treatment that each participant actually receives; without this, the complex experimental designs are of little value.

f. Mechanisms and multiple treatments

The history in Section III makes clear that many labor market experiments involve variation in more than one aspect of a given program. This is clearly the case when programs consisting of suites of services and incentives are evaluated, such as in randomized evaluations of welfare-to-work programs or of large-scale training programs with a range of integrated services such as JTPA or Job Corps. Yet, even the interpretation of many RCTs of smaller training programs is made difficult by the fact that some form of job search assistance is provided. Simple RCTs do not identify which of the components of the treatment are responsible for the impact. Learning about such mechanisms, besides being of interest in its own right, is particularly desirable if one wishes to extrapolate to new programs or learn about underlying behavioral parameters. This is recognized, for example, in the ongoing evaluation of the REA program discussed in Section III, which aims explicitly at distinguishing the effect of a ‘hassle’ due to being summoned to appear from the actual job search assistance provided.

Even when the treatment has only one component, in many cases that component is sufficiently complex that the average treatment effect is not enough – we want to understand the underlying mechanism. The simplest example of this is labor supply experiments, for which it is often important to distinguish income and substitution effects. It also arises in many of the welfare reform programs, which can create complex changes in intertemporal budget constraints due to time limits or eligibility effects.


Researchers have used a number of strategies to extract from experimental data evidence on the mechanisms underlying the treatment effects identified by the experiment. In the simplest case, it is sometimes possible to use experimental variation to distinguish the relevant mechanisms, with only minimal restrictions derived from theory. This is most feasible when the experiment involves more than two groups. The first large-scale social experiments, the Negative Income Tax studies, were used in this way. The “treatment” here was a tax schedule described by two parameters: The transfer received if earnings were zero and the tax rate applied to any earnings. The main outcome was labor supply, and a key concern of these studies was to distinguish income from substitution effects.


With a single treatment arm and a single control group, this would not be possible: The net effect of the treatment would be identified, but there would be no way of distinguishing substitution from income effects. (One exception would be if the treatment were designed to be a fully compensated change in the marginal tax rate – this would have no income effect, so the treatment effect would equal the substitution effect. But the NIT treatments were not designed this way.) With multiple treatments that vary both the base transfer and the marginal tax rate, and with an assumption that both income and substitution effects are linear in the relevant tax variable, the two effects can be estimated separately.

To see this, suppose a labor supply function that relates hours of work ($H$) to the wage rate ($w$), non-labor income ($N$), the marginal tax rate ($r$), and other factors such as preferences for leisure ($e$):

$$H = f(w, N, r, e).$$

For simplicity of exposition, we assume a constant marginal tax rate, though this is not crucial (see Hausman 1985). A more restrictive assumption is that the individual labor supply function is linear and additively separable in non-labor income and the net-of-tax hourly wage:

$$H_i = \gamma_i + w_i(1 - r_i)\,\delta_i + N_i\,\eta_i.$$

Now consider a simple experiment that assigns some individuals to a control group where $r_i$ and $N_i$ are not manipulated, and others to a treatment group that receives an additional baseline transfer $D$ and faces an increment to the tax rate $t$. Then, adopting the earlier potential outcomes framework, each individual has two potential outcomes:

$$H_{i0} = \gamma_i + w_i(1 - r_i)\,\delta_i + N_i\,\eta_i \quad \text{and}$$

$$H_{i1} = \gamma_i + w_i(1 - r_i - t)\,\delta_i + (N_i + D)\,\eta_i.$$

With random assignment, the difference in mean labor supply between treatment and control groups, where $D_i$ denotes the group to which individual $i$ is assigned, equals

$$E[H_i \mid D_i = 1] - E[H_i \mid D_i = 0] = -t\,E[w_i\delta_i] + D\,E[\eta_i].$$

The first term here represents substitution effects, while the second represents income effects. But the simple experiment identifies only the combination of them.

Fortunately, the NIT studies involved multiple treatment arms, with various combinations of transfers and tax rates. Consider a simple extension of the above structure, with two treatment groups 1 and 2 and associated parameters $\{D_1, t_1\}$ and $\{D_2, t_2\}$. Now each individual has three potential outcomes associated with assignment to the control group and each of the treatment groups, $H_{i0}$, $H_{i1}$, and $H_{i2}$. Two distinct treatment-control contrasts can be computed:

$$E[H_i \mid D_i = 1] - E[H_i \mid D_i = 0] = -t_1\,E[w_i\delta_i] + D_1\,E[\eta_i] \quad \text{and}$$

$$E[H_i \mid D_i = 2] - E[H_i \mid D_i = 0] = -t_2\,E[w_i\delta_i] + D_2\,E[\eta_i].$$

This is a system of two linear equations and two unknowns. So long as the system has full rank – here, as long as $D_1/D_2 \neq t_1/t_2$ – it can be solved for the mean income elasticity of labor supply, $E[\eta_i]$, and for $E[w_i\delta_i]$. The latter can be divided by the mean wage rate, $E[w_i]$, to obtain a wage-rate-weighted mean substitution elasticity. (With a large enough sample, the mean substitution elasticity, $E[\delta_i]$, could be identified by stratifying the treatment-control comparison by the wage rate.)
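Solving the two-equation system is mechanical once the two treatment-control contrasts are in hand. The sketch below is an illustration under the linear labor supply model above; the function name and the numerical values in the usage lines are hypothetical and are not NIT estimates.

```python
import numpy as np

def solve_income_substitution(contrast1, contrast2, D1, t1, D2, t2):
    """Solve the two treatment-control contrasts for E[w*delta] and E[eta].

    contrast_k = -t_k * E[w*delta] + D_k * E[eta],  k = 1, 2
    """
    A = np.array([[-t1, D1],
                  [-t2, D2]], dtype=float)
    # Full-rank condition from the text: D1 / D2 != t1 / t2.
    if np.isclose(np.linalg.det(A), 0.0):
        raise ValueError("treatment arms are proportional; not identified")
    e_w_delta, e_eta = np.linalg.solve(A, np.array([contrast1, contrast2],
                                                   dtype=float))
    return e_w_delta, e_eta

# Hypothetical arms: transfers of 1000 and 2000, tax increments of 0.3 and 0.5,
# with mean hours contrasts of -25 and -45 relative to the control group.
e_w_delta, e_eta = solve_income_substitution(-25.0, -45.0,
                                             D1=1000.0, t1=0.3,
                                             D2=2000.0, t2=0.5)
```

In practice the contrasts are estimated with error, so the solution inherits sampling variance that grows as the two arms become closer to proportional.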


A number of studies used the NIT experiment data to estimate the parameters of the labor supply function in basically this way, accounting for additional complications that we neglect here (e.g., participation decisions, non-linear tax schedules, etc.) and often using more complex labor supply functions. See, e.g., Moffitt (1979). But this was by no means universal: In the late 1970s, the experimental paradigm was not as well developed, and many of the studies that used the experimental data did not rely solely on the randomly assigned components of non-labor income and tax rates for identification (e.g., Keeley et al., 1978).

In the above simple model the mean income and substitution elasticities are just identified with two treatment arms. With more than two arms – the Seattle/Denver experiment alone had 11 – the model is over-identified. This opens the possibility of performing over-identification tests of the restrictions imposed when specifying the labor supply function. Ashenfelter and Plant (1990) estimate separate treatment effects of each treatment arm, but we are not aware of studies that investigate formally whether the pattern of effects is consistent with a posited labor supply function.

Even absent multiple treatment arms, sometimes statistical or theoretical models and assumptions can enable researchers to learn about mechanisms that generate a program effect. For example, Card and Hyslop (2005) [henceforth CH] analyze the data from the Canadian Self-Sufficiency Program (SSP) RCT. SSP, a welfare-to-work program, combined a strong, temporary work incentive for participating workers with a fixed initial time period during which welfare recipients had to establish eligibility in the program by working full time. As a result of this two-tiered structure, the simple experiment analysis does not distinguish the effects of the various components of the program. This makes it difficult to compare the effects of SSP with other welfare-to-work programs, to assess how SSP worked, and to draw lessons for similar programs. CH use a parametric statistical model to separately identify the effect of the different incentives inherent in the SSP program. In contrast to static evaluations of welfare-to-work programs, CH focus on the dynamic labor supply incentives inherent in the program.

One cannot directly analyze the effect of the subsidy (which in the following we will refer to as the SSP program) for those who became eligible, because of selection in the eligibility decision. One can, however, model eligibility as a type of imperfect compliance, permitting the estimation of the LATE of SSP on total employment or on the fraction employed at any given point in time. When one turns to dynamic analyses, potential differential changes in the nature of selection in the treatment and control groups make it impossible to estimate the dynamic responses of hazard rates or wages just based on the RCT.[44] In addition, as in other welfare evaluations, endogenous employment decisions make an analysis of wage outcomes problematic. Another issue is that in the short run the strong work incentive arising from the option value in the eligibility period is potentially confounded with the effect of the subsidy.

[44] CH use a standard search theory to model the incentives of SSP, and capture the effect of eligibility and the SSP subsidy on labor supply incentives via their effects on the reservation wage. The search model clarifies that in the presence of heterogeneity, the pool of workers employed at any given point in time may be selected, whether or not there also is sample selection arising from employment decisions (e.g., Ham and Lalonde 1996).

To address these difficulties, CH proceed by developing a logistic model with random effects and heterogeneity to estimate a benchmark for welfare transitions in the absence of SSP (i.e., for the control group). This model is then combined with parametric specifications of the treatment effects over different ranges of the program spell, as implied by incentives inherent in SSP. This step includes modeling the participation decision and welfare transitions as functions of the SSP subsidy and current and lagged welfare status. A key assumption thereby is that the chosen controls for heterogeneity and the functional form restrictions are sufficient to control for the dynamic selection bias introduced by the eligibility window. CH experiment with different specifications of heterogeneity, and provide ample discussion of the goodness of fit of the model. As a result of this exercise, they are able to obtain separate effects of eligibility and SSP. This allows them to simulate the effects of different components of the program and counterfactual policy changes relating to the time path of the subsidy.

The approach and findings in CH suggest that one may not need a structural model to separately identify multiple treatment effects or the dynamic effects of a program, or to simulate the effect of alternative policies. However, an assumption on functional form is required, as well as harder-to-assess assumptions on the form of underlying heterogeneity.

To estimate mechanisms underlying the effect of experimental or policy variation, other papers have used insights from theory to aid identification without estimating a structural model. For example, Schmieder, von Wachter, and Bender (2016) use insights from the standard search model to estimate the effect of unemployment duration on wages. A recurring question in the analysis and evaluation of welfare and unemployment programs has been the effect of employment and unemployment on productivity and wages. If wages rise with employment duration, welfare-to-work programs can lead to sustained labor force participation. In contrast, if longer nonemployment duration reduces wages, and hence the incentive to work, more generous benefits can lead to a welfare trap. Card and Hyslop (2005) find that increased employment in the course of the Canadian Self-Sufficiency Program did little to increase wages. In contrast, Grogger (2005) finds positive wage impacts of employment in the context of a randomized evaluation of Florida’s welfare-to-work program.

Few papers have directly analyzed the effect of unemployment duration on wages.[45] The question is difficult for at least two reasons. First, as in Card and Hyslop (2005), even with exogenous variation in incentives at the group level, the type of worker employed at any given point in the unemployment spell may differ between the treatment and control groups.[46] In other words, it is difficult to find a valid instrument for the duration of unemployment. A second complication arises because even if such variation were available, a change in wages might arise either because of a change in wage offers or because of a change in reservation wages.

To address these difficulties, Schmieder, von Wachter, and Bender (2016) use the fact that the canonical search model has the strong prediction that forward-looking individuals valuing future unemployment insurance benefits will respond to a benefit extension by raising their reservation wage well before benefit exhaustion. Unless reservation wages do not bind, this implies that extensions in UI durations should lead to increases in observed reemployment wages throughout the spell. In contrast to this prediction, Schmieder et al. (2016) find, in the context of discontinuous increases in unemployment insurance durations in Germany, that reemployment wages at different points of the unemployment spell are unaffected. They deduce that reservation wages likely had little effect on observed wages and hence that the effect of an increase in UI benefit durations on wages arose from an effect of the rise in nonemployment durations on offered wages. In this case, an exogenous increase in UI benefit durations can be used as an instrument to estimate the effect of nonemployment duration on wages.[47]

[45] An exception is Addison and Blackburn (2000), who discuss some of the issues that arise. A larger number of papers has addressed the question of duration dependence in unemployment spells. See Kroft, Lange, and Notowidigdo (2013) and references therein.

[46] This bias arises even in the absence of differences in participation.
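The instrumental-variables logic in the preceding paragraph can be sketched as follows, under the assumption (suggested by the discussion above) that a UI extension affects wages only through nonemployment duration. The data-generating process, the helper `iv_ratio`, and every parameter value are hypothetical and chosen only to show how the IV estimate can differ from a naive regression when unobserved worker heterogeneity drives both duration and wages.

```python
import numpy as np

def iv_ratio(outcome, endogenous, instrument):
    """Just-identified IV estimate: cov(outcome, instrument) / cov(endogenous, instrument)."""
    c_yz = np.cov(outcome, instrument)[0, 1]
    c_dz = np.cov(endogenous, instrument)[0, 1]
    return c_yz / c_dz

rng = np.random.default_rng(1)
n = 50_000
extended = rng.integers(0, 2, size=n)   # 1 = longer potential UI duration (as good as random)
ability = rng.normal(size=n)            # unobserved heterogeneity
# Months nonemployed: longer with the extension, shorter for high-ability workers.
duration = 6 + 1.5 * extended - 1.0 * ability + rng.normal(size=n)
# Log wage: assumed to fall by 0.01 per month of nonemployment and to rise with ability.
log_wage = 3 - 0.01 * duration + 0.2 * ability + rng.normal(0, 0.1, size=n)

beta_iv = iv_ratio(log_wage, duration, extended)    # close to the assumed -0.01
beta_ols = np.polyfit(duration, log_wage, 1)[0]     # contaminated by selection on ability
```

If reservation wages did respond to the extension, the exclusion restriction in this sketch would fail and the ratio would no longer isolate the effect of duration on offered wages.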

Another study incorporating theoretical insights from search theory into an empirical study of unemployment insurance is that of DellaVigna, Lindner, Reizer, and Schmieder (2016), who analyze a change in the time path of UI benefits in Hungary that kept benefits in the final tier unchanged. They use this variation to structurally estimate key parameters of a model with reference dependence, and find the model does quite well compared to an alternative model that explains the pattern based on (unspecified) heterogeneity. The incorporation of non-standard behavioral assumptions into the evaluation of labor market programs is still in its infancy, but is an important avenue for future research.[48]

[47] The authors argue that their test excludes any effect of the worker's outside option on wages, and hence the findings are not specific to the particular model.

A topic closely related to the question of mechanisms is the extrapolation of experimental evidence to consider the impacts of new policies not included in the original evaluation. The value of such extrapolations has long been one of the primary arguments in favor of structural modeling (and against reliance on purely experimental evidence), but some scholars have found ways to synthesize the approaches. The main challenge here is to bridge between the relatively few parameters that are cleanly identified by an experiment and the larger set of parameters that are needed to characterize most structural models.

One way to do this is to start with a characterization of structural behavior that is simple enough to be captured within the experimental evidence. For example, if one assumes that the labor supply function is characterized by constant income and (compensated) substitution elasticities, then the estimates of these parameters that are identified by the NIT experiments are sufficient to identify the effects of alternative NIT parameters that were not included in the experimental treatments. A drawback of such an approach is that the range of policies that can be examined is limited. The approach can be extended, of course, to estimate a more complex structural model that relies on additional statistical and theoretical assumptions, additional non-experimental moments, or both. In any event, this sort of exercise is on more solid ground when trying to interpolate to values within the range of tax parameters included in the experiment than when these parameters need to be extrapolated outside of that range.

[48] For some exceptions, see Lemieux and MacLeod (2000), DellaVigna and Paserman (2005), Oreopoulos (2007); more recently, Chan (2014) examines the role of time-inconsistency in the context of the randomized evaluation of the Florida Transition Program. Babcock, Congdon, Katz, and Mullainathan (2012) give an overview of the potential importance of behavioral assumptions for the evaluation of public programs.

A more recent, closely related approach is known as the “sufficient statistics” approach (Chetty 2009). Here, the goal is to characterize optimal policy. Starting with a fully characterized (but usually not overly complex) structural model, it is often possible to derive expressions for social welfare, or for the optimal policy, that depend only on a small number of reduced-form parameters. For example, the Baily-Chetty (Baily 1978, Chetty 2006) formula for optimal unemployment insurance benefits expresses the optimal benefit level in terms of the elasticity of unemployment duration with respect to UI benefits, and the income and substitution effects on the exit hazard from unemployment. If one had experimental evidence regarding these effects, one could use the formula to derive the optimal policy (e.g., Chetty 2008, Card, Chetty, and Weber 2007).
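As a rough illustration of how a sufficient-statistics formula turns a few reduced-form numbers into a policy recommendation, the sketch below uses the consumption-based approximation commonly associated with Baily (1978) and Chetty (2006), in which the optimal replacement rate b equates the coefficient of relative risk aversion times the proportional consumption drop at unemployment with the elasticity of unemployment duration with respect to benefits. This is not the hazard-based version referred to in the text. The linear form assumed for the consumption drop, the function name, and all parameter values are hypothetical.

```python
def optimal_replacement_rate(gamma, elasticity, drop_at_zero, drop_slope):
    """Solve gamma * (drop_at_zero - drop_slope * b) = elasticity for b, clipped to [0, 1].

    gamma        : coefficient of relative risk aversion (assumed known)
    elasticity   : elasticity of unemployment duration w.r.t. the benefit level
    drop_at_zero : proportional consumption drop at unemployment when b = 0 (assumed)
    drop_slope   : reduction in that drop per unit of replacement rate (assumed linear)
    """
    if gamma <= 0 or drop_slope <= 0:
        raise ValueError("gamma and drop_slope must be positive")
    b = (drop_at_zero - elasticity / gamma) / drop_slope
    return min(1.0, max(0.0, b))

# Purely illustrative inputs: more risk-averse workers warrant a higher replacement rate.
b_low_risk_aversion = optimal_replacement_rate(gamma=2.0, elasticity=0.4,
                                               drop_at_zero=0.25, drop_slope=0.3)
b_high_risk_aversion = optimal_replacement_rate(gamma=4.0, elasticity=0.4,
                                                drop_at_zero=0.25, drop_slope=0.3)
```

The calculation makes plain why the approach leans on the underlying model: the answer moves substantially with the assumed risk aversion and with the assumed shape of the consumption response, neither of which need be identified by the experiment that delivers the elasticity.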

Of course, any sufficient statistics approach is dependent upon the validity of the underlying structural model: there is no assurance that the true structural model generates the same sufficient statistics as does the one posited by the researcher. In some cases, this may include a relevant class of models and hence provide a degree of robustness. For example, Chetty (2009) gives the example of heterogeneity in treatment effects, where the optimal policy depends only on the mean effect. Yet, it can be hard to know which assumptions in the structural model matter, and generally the assumptions needed to derive the sufficient statistics are fairly strong. At a practical level, conclusions about optimal policies may involve extrapolating very far from the range of policy variation included in the experiment, which means relying strongly on the validity of the theoretical model. In this context, a potential drawback of sufficient statistics is that, in contrast to explicitly structural work, the empirical fit of the model against the data cannot be assessed.

An alternative approach to obtain a framework for policy extrapolation based on experimental variation is to estimate, or calibrate, a full structural model, using experimental evidence to aid in identifying (some of) the necessary parameters. One approach is to fix individual parameters at the values indicated by experiments, then calibrate or structurally estimate the remainder. This approach is pursued, for example, by Davidson and Woodbury (1997), who use the Illinois reemployment bonus experiment to estimate the parameters of a search cost function, then combine this function with calibrated values, derived from non-experimental data, for other parameters of their model of optimal UI benefits. Another approach is to use experimental data to fit a full structural model, but keep the model sufficiently simple such that the main parameters of the model are identified by the available variation, as for example in DellaVigna, Lindner, Reizer, and Schmieder (2016). An alternative is to estimate the structural model solely with non-experimental data, then use experimental evidence to validate predictions that the model makes for particular reduced-form comparisons (e.g., Todd and Wolpin 2006).[49]

[49] Another approach to extrapolation that can be viewed as a hybrid between structural and reduced form approaches is to use experimental variation in the incentive to take up a program to effectively estimate a structural model of the compliance rate (e.g., Heckman and Vytlacil 2005). As described in Section 5.d, under certain circumstances this allows one to obtain the full distribution of marginal treatment effects and hence to extrapolate.

In some cases, the experimental design can be structured to help uncover the mechanisms underlying the treatment effect of the program. Economic theory may be particularly useful here in connecting fundamental parameters and mechanisms to the types of impacts that can be measured with experiments. One approach is to design an experiment that targets a particular mechanism of interest, rather than identifying the effect of a well-defined program that might be implemented. Kling, Congdon, Ludwig, and Mullainathan (this volume) refer to this as a “mechanism experiment,” distinguishing it from a program evaluation. Standard models in labor economics or other fields may provide useful characterizations of the behavioral mechanisms to be tested. For example, models of human capital investment have implications for the factors determining take-up and success of training or schooling programs that may be useful in structuring the experimental design.

A closely related approach is to introduce multiple treatment arms, with program variation among them that can help uncover underlying parameters. The NIT experiments discussed above present a straightforward example of a congenial marriage of classic (static) labor supply theory and the experimental design. As discussed above, as long as both income and substitution effects are linear in the relevant tax measure, multiple treatments manipulating both the base transfer and the marginal tax rate can be used to separately estimate the income and substitution effects.[50]
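The identification argument for multiple treatment arms can be illustrated with a small simulation. The sketch assumes, as in the text, that the hours response is linear in the guarantee and in the marginal tax rate; the arm definitions, the response coefficients, and the noise level are hypothetical and do not correspond to any actual NIT experiment. Mapping the two estimated coefficients into income and compensated substitution effects would require the additional labor supply theory discussed earlier.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical arms: (guarantee in $1000s, marginal tax rate); the first row is the control group.
arms = np.array([[0.0, 0.0], [4.0, 0.5], [4.0, 0.7], [6.0, 0.5]])
n_per_arm = 5_000
arm_id = np.repeat(np.arange(len(arms)), n_per_arm)
G, t = arms[arm_id, 0], arms[arm_id, 1]

# Assumed linear responses: hours per $1000 of guarantee and per unit of tax rate.
true_beta = np.array([-20.0, -300.0])
hours = 1800 + true_beta[0] * G + true_beta[1] * t + rng.normal(0, 400, size=G.size)

# Least squares on a constant, the guarantee, and the tax rate.
X = np.column_stack([np.ones_like(G), G, t])
coef, *_ = np.linalg.lstsq(X, hours, rcond=None)
beta_G_hat, beta_t_hat = coef[1], coef[2]
```

With only one treatment arm, the guarantee and the tax rate would be perfectly collinear with the treatment indicator, and only their combined effect could be estimated; it is the independent variation across arms that separates the two responses.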

The evaluation of the SSP program discussed above is a good example of an experiment that would have benefited from a second treatment arm. Such a treatment might have randomly varied the incentive to become eligible for the (randomly assigned) work subsidy in the main phase of the program. More generally, decisions and programs involving inter-temporal tradeoffs may be an area in which more complex experiments can be particularly insightful. For example, typical UI systems involve expiring benefits, and JSA programs involve sanctions; the timing of benefit exhaustion, reemployment bonuses, or sanctions has been shown to have important empirical effects on reemployment rates (e.g., Meyer 1995, Black, Smith, Berger, and Noel 2003, Schmieder, von Wachter, and Bender 2012). Hence, experiments that try to get at the underlying behavioral mechanisms may provide important insights into how these programs affect labor supply choices. Knowledge of such mechanisms is also a crucial input in optimizing the delivery of insurance or assistance in the labor market. For example, this could involve a reemployment bonus that declines over time, or one that is available only to those who survive to a specified point. By randomly varying the amount, slope, or intervals, one may gain insights into the nature of inter-temporal decision making relevant for these programs. Inter-temporal choice is also an area where theory is likely to be helpful in providing identifying structure. For example, if the goal is to learn about potential behavioral biases, a model of the effect of particular biases can yield insightful predictions for job search behavior (e.g., DellaVigna, Lindner, Reizer, and Schmieder 2016).[51]

[50] Multiple treatments may not be necessary. For example, with appropriate data and assumptions, one could in principle experimentally vary compensated wage changes to identify the compensated substitution effect. This more closely resembles a mechanism experiment.

The usefulness of theory in informing experimental design hinges, of course, on the model being correct. To mitigate the reliance on particular assumptions (e.g., on functional forms), in principle one could use revealed preference arguments to generate robust predictions from theory that are then used in the design of an experiment. For example, one could use results obtained by Pinto (2015) or Kline and Tartari (2016) to devise multiple treatment arms to test the implied restrictions. However, a model may not be necessary to enrich the experimental design to study underlying channels. The SSP example shows that a basic understanding of the incentives and the nature of the program can be sufficient to design an RCT that uncovers the potentially complex mechanisms underlying the simple SSP evaluation.

V. Conclusion

Because they allow researchers to control assignment into treatment, randomized controlled trials are the gold standard for program evaluation. But while random assignment solves the selection problem, there is a broad range of additional relevant design issues that arise routinely in the analysis of central economic questions and that are not solved by random assignment on its own. In this chapter, we have discussed six such design issues in depth: (1) spillover effects and interactions between individuals, leading to a failure of SUTVA; (2) impacts on outcomes that are only observed conditional on individual choices and hence are endogenous, such as wages, hours worked, or participation in a follow-up survey; (3) heterogeneity in treatment effects between experimental sites and observed population groups, and (4) imperfect compliance and heterogeneity in unobserved characteristics, both of which can make it hard to interpret treatment effects and extrapolate to other programs; (5) hidden treatment effects arising because controls also receive versions of the treatment; and (6) the understanding of the mechanisms behind the treatment effect, in particular in the presence of multiple treatments.

[51] As already mentioned in the discussion of heterogeneous treatment effects, another area where theory is likely to be useful is in understanding the determination of compliance rates. As discussed above, the main idea is to experimentally manipulate the incentive to participate and use the variation to trace out the marginal treatment effect (MTE) curve. Theoretical considerations can tell us how to realistically vary the cost of compliance and hence be able to estimate the full range of treatment effects.

We discuss these design issues and solutions in the context of social experiments in the United States labor market, which have provided most of what we know about the functioning of the main labor market programs. Of course, the labor economics literature has been well aware of the limitations of experiments in general and some of these design issues in particular. We have reviewed approaches that can be used to address the design issues in the context of randomized experiments. These include approaches that can be applied once randomization is completed, and ways to modify the experiments themselves to address the concerns we identify.

While we discuss design issues in the context of experiments in the labor market, these issues can arise in all areas that have seen active experimental activities, including field experiments discussed elsewhere in this volume. Hence the solutions we identify can be applied to a broad range of questions and should be useful for a wide range of researchers interested in harnessing the power of randomized controlled trials.

We close with a brief discussion of recent trends in labor market social experiments, several of which highlight the need to pay more attention to the potential design issues in experimental evaluations that we discuss. One overarching trend, cutting across several areas of research, is that academic economists have become more involved with the implementation of experiments. In labor economics, for example, this has meant a shift away from randomized controlled trials implemented by large, specialized policy consulting firms (e.g., Mathematica, MDRC, or Abt Associates). For example, several experiments have evaluated take-up of actual government programs within the context of services provided by H&R Block (e.g., Bettinger, Long, Oreopoulos, and Sanbonmatsu 2012). Another example is the increasing number of randomized trials evaluating the role of economic incentives for teachers (e.g., Fryer, Levitt, List, and Sadoff 2012; Fryer 2013; Springer et al. 2010). Similarly, experiments taking place within private businesses have also been quite successful (e.g., Bandiera, Barankay, and Rasul 2009).

The greater involvement of academic economists harbors both upside potential, if researchers implement state-of-the-art techniques to address additional design issues, and challenges, as there is a broad range of issues that must be considered and monitored when implementing an experimental evaluation of an existing program or a new, complex treatment in a real-world setting. We hope the discussion of the design issues in this chapter, as well as our summary of the practical aspects of implementing social experiments, will provide a useful guide for those interested in implementing such social experiments.

A second, related trend has been a movement toward evaluating topics in personnel economics (e.g., the response of teachers to incentive pay programs) as distinct from government social programs. These are often conducted within particular firms, and implicate a number of the design issues we discuss, most notably issues of site effects and heterogeneity.

A third important trend has been the use of the actual online labor market for what amount to field experiments in the taxonomy we set out at the outset (e.g., Pallais 2014). The Internet may well provide a useful resource for future social experiments too. A key advantage is that researchers may be able to better control the environment, perhaps allowing them to implement more complex study designs that address some of the issues we pose.

References

Addison, J.T., Blackburn, M.L. 2000. The effects of unemployment insurance on postunemployment earnings. Labour Economics, 7(1), 21-53.

Ahn, H., Powell, J.L. 1993. Semiparametric estimation of censored selection models with a nonparametric selection mechanism. Journal of Econometrics, 58(1), 3-29.

Allcott, H. 2015. Site selection bias in program evaluation. Quarterly Journal of Economics, 130(3), 1117-1165.

Altonji, J.G., Blank, R.M. 1999. Race and gender in the labor market. Handbook of Labor Economics, 3(3), 3143-3259.

Anderson, M. 2008. Multiple inference and gender differences in the effects of early intervention: A reevaluation of the Abecedarian, Perry Preschool, and Early Training projects. Journal of the American Statistical Association, 103(484), 1481-1495.

Angrist, J.D., Hull, P., Pathak, P.A., Walters, C. 2015. Leveraging lotteries for school value-added: Testing and estimation (Working Paper 21748). National Bureau of Economic Research.

Angrist, J.D., Imbens, G.W. 1995. Two-stage least squares estimation of average causal effects in models with variable treatment intensity. Journal of the American Statistical Association, 90(430), 431-442.

Angrist, J.D., Imbens, G.W., Rubin, D.B. 1996. Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91(434), 444-455.

Angrist, J.D., Krueger, A.B. 1999. Empirical strategies in labor economics. Handbook of Labor Economics, 3, 1277-1366.

Ashenfelter, O., Ashmore, D., Deschênes, O. 2004. Do unemployment insurance recipients actively seek work? Evidence from randomized trials in four US states. Journal of Econometrics, 125(1-2), 53-75.

Ashenfelter, O., Plant, M.W. 1990. Nonparametric estimates of the labor-supply effects of negative income tax programs. Journal of Labor Economics, 8(1), S396-S415.

Athey, S., Imbens, G. 2016. The econometrics of randomized experiments. Handbook of Field Experiments (forthcoming).

Babcock, L., Congdon, W.J., Katz, L.F., Mullainathan, S. 2012. Notes on behavioral economics and labor market policy. IZA Journal of Labor Policy, 1(2), 1-14.

Baily, M.N. 1978. Some aspects of optimal unemployment insurance. Journal of Public Economics, 10(3), 379-402.

Baird, S., Bohren, A., McIntosh, C., Ozler, B. 2015. Designing experiments to measure spillover effects, second version (Working Paper 15-021). Penn Institute for Economic Research.

Bandiera, O., Barankay, I., Rasul, I. 2009. Social connections and incentives in the workplace: Evidence from personnel data. Econometrica, 77(4), 1047-1094.

Barnes, M.S., Benus, J., Cooper, J., Dugan, M.K., Kirsch, M.P., Johnson, T. 2014. U.S. Department of Labor Jobs Corps Process Study Final Report. U.S. Department of Labor. [Available at: http://wdr.doleta.gov/research/keyword.cfm?fuseaction=dsp_resultDetails&pub_id=2538&mp=y].

Barnow, B.S. 2000. Exploring the relationship between performance management and program impact: A case study of the Job Training Partnership Act. Journal of Policy Analysis and Management, 19(1), 118-141.

Becerra, R.M., Lew, V., Mitchell, M.N., Ono, H. 1998. Final report: California Work Pays Demonstration Project, report of the first forty-two months. School of Public Policy and Social Research, University of California-Los Angeles, Los Angeles.

Beecroft, E., Lee, W., Long, D., Holcomb, P.A., Thompson, T.S., Pindus, N., O’Brien, C., Bernstein, J. 2003. The Indiana welfare reform evaluation: Five-year impacts, implementation, costs and benefits. Abt Associates: Cambridge, MA.

Bell, S.H., Bloom, H.S., Cave, G., Doolittle, F., Lin, W., Orr, L.L. 1994. The National JTPA Study: Overview: Impacts, benefits, and costs of Title II-A. Abt Associates: Cambridge, MA.

Bell, S.H., Orr, L.L., Burstein, N.R. 1987. Evaluation of the AFDC Homemaker-Home Health Aide Demonstrations: Overview of evaluation results. Abt Associates: Cambridge, MA.

Benus, J., Yamagata, E.P., Wang, Y., Blass, E. 2008. Reemployment and Eligibility Assessment (REA) study: FY 2005 initiative: Final report. IMPAQ International, 1-173.

Bertrand, M., Mullainathan, S. 2004. Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination. American Economic Review, 94(4), 991-1013.

Bettinger, E., Long, B.T., Oreopoulos, P., Sanbonmatsu, L. 2012. The role of application assistance and information in college decisions: Results from the H&R Block FAFSA experiment. Quarterly Journal of Economics, 127(3), 1205-1242.

Bitler, M.P., Gelbach, J.B., Hoynes, H.W. 2006. What mean impacts miss: Distributional effects of welfare reform experiments. The American Economic Review, 96(4), 988-1012.

Black, D.A., Galdo, J., Smith, J.A. 2007. Evaluating the Worker Profiling and Reemployment Services System using a regression discontinuity approach. The American Economic Review, 97(2), 104-107.

Black, D.A., Smith, J.A., Berger, M.C., Noel, B.J. 2003. Is the threat of reemployment services more effective than the services themselves? Evidence from random assignment in the UI system. American Economic Review, 93(4), 1313-1327.

Bloom, H.S., Hill, C.J., Riccio, J.A. 2005. Modeling cross-site experimental differences to find out why program effectiveness varies. In Bloom, H.S., ed., Learning more from social experiments: Evolving analytic approaches. Russell Sage Foundation, 37-74.

Bloom, D., Kemple, J.J., Morris, P., Scrivener, S., Verma, N., Hendra, R. 2000. Final report on Florida’s initial time-limited welfare program. Manpower Demonstration Research Corporation: New York, December.

Bloom, H.S., Orr, L.L., Bell, S.H., Cave, G., Doolittle, F., Lin, W., Bos, J.M. 1997. The benefits and costs of JTPA Title II-A programs: Key findings from the National Job Training Partnership Act Study. Journal of Human Resources, 32(3), 549-576.

Bloom, D., Scrivener, S., Michalopoulos, C., Morris, P., Hendra, R., Adams-Ciardullo, D., Walter, J. 2002. Jobs First: Final report on Connecticut's welfare reform initiative. Manpower Demonstration Research Corporation.

Blundell, R., Bozio, A., Laroque, G. 2011. Labor supply and the extensive margin. The American Economic Review, 101(3), 482-486.

Blundell, R., Dias, M.C., Meghir, C., Reenen, J.V. 2004. Evaluating the employment impact of a mandatory job search program. Journal of the European Economic Association, 2(4), 569-606.

Brinch, C., Mogstad, M., Wiswall, M. Forthcoming. Beyond LATE with a discrete instrument. Journal of Political Economy.

Buchinsky, M. 1994. Changes in the US wage structure 1963-1987: Application of quantile regression. Econometrica: Journal of the Econometric Society, 62(2), 405-458.

Burghardt, J., Schochet, P.Z., McConnell, S., Johnson, T., Gritz, R.M., Glazerman, S., Homrighausen, J., Jackson, R. 2001. Does Job Corps work? Summary of the National Job Corps Study. Mathematica Policy Research: Princeton, NJ.

Card, D., Chetty, R., Weber, A. 2007. Cash-on-hand and competing models of intertemporal behavior: New evidence from the labor market. The Quarterly Journal of Economics, 122(4), 1511-1560.

Card, D., Hyslop, D.R. 2005. Estimating the effects of a time-limited earnings subsidy for welfare-leavers. Econometrica, 73(6), 1723-1770.

Card, D., Kluve, J., Weber, A. 2010. Active labor market programs: A meta-analysis. The Economic Journal, 120(548), F452-477.

Cave, G., Bos, H., Doolittle, F., Toussaint, C. 1993. JOBSTART. Final report on a program for school dropouts. Manpower Demonstration Research Corp: New York.

Cerqua, A., Pellegrini, G. 2014. Do subsidies to private capital boost firms' growth? A multiple regression discontinuity design approach. Journal of Public Economics, 109(C), 114-126.

Chan, M.K. 2014. Welfare dependence and self-control: An empirical analysis. Working paper, Economics Discipline Group, UTS Business School, University of Technology, Sydney.

Chetty, R. 2006. A general formula for the optimal level of social insurance. Journal of Public Economics, 90(10), 1879-1901.

Chetty, R. 2008. Moral hazard versus liquidity and optimal unemployment insurance. Journal of Political Economy, 116(2), 173-234.

Chetty, R. 2009. Is the taxable income elasticity sufficient to calculate deadweight loss? The implications of evasion and avoidance. American Economic Journal: Economic Policy, 1(2), 31-52.

Chetty, R., Friedman, J.N., Rockoff, J.E. 2014. Measuring the impacts of teachers I: Evaluating bias in teacher value-added estimates. American Economic Review, 104(9), 2593-2632.

Chodorow-Reich, G., Karabarbounis, L. 2016. The limited macroeconomic effects of unemployment benefit extensions (Working Paper 22163). National Bureau of Economic Research.

Coglianese, J.J. 2015. Do unemployment insurance extensions reduce employment? Mimeo, Harvard University.

Corson, W., Decker, P., Dunstan, S.M., Kerachsky, S. 1991. Pennsylvania reemployment bonus demonstration: Final report (Unemployment Insurance Occasional Paper 92-1). U.S. Department of Labor: Washington, DC.

Corson, W., Long, D., Nicholson, W. 1984. Evaluation of the Charleston Claimant Placement and Work Test Demonstration. Mathematica Policy Research.

Crépon, B., Duflo, E., Gurgand, M., Rathelot, R., Zamora, P. 2013. Do labor market policies have displacement effects? Evidence from a clustered randomized experiment. The Quarterly Journal of Economics, 128(2), 531-580.

Davidson, C., Woodbury, S.A. 1997. Optimal unemployment insurance. Journal of Public Economics, 64(3), 359-387.

Deaton, A. 2010. Instruments, randomization, and learning about development. Journal of Economic Literature, 48(2), 424-455.

Dehejia, R.H., Wahba, S. 2002. Propensity score-matching methods for nonexperimental causal studies. Review of Economics and Statistics, 84(1), 151-161.

DellaVigna, S., Lindner, A., Reizer, B., Schmieder, J.F. 2016. Reference-dependent job search: Evidence from Hungary (Working Paper 22257). National Bureau of Economic Research.

DellaVigna, S., Paserman, M.D. 2005. Job search and impatience. Journal of Labor Economics, 23(3), 527-588.

DiNardo, J., Fortin, N.M., Lemieux, T. 1996. Labor market institutions and the distribution of wages, 1973-1992: A semiparametric approach. Econometrica, 64(5), 1001-1044.

Dorsett, R., Hendra, R., Robins, P.K., Williams, S. 2013. Can post-employment services combined with financial incentives improve employment retention for welfare recipients? Evidence from the Texas Employment Retention and Advancement Evaluation. NIESR Discussion Paper No. 409.

Farber, H.S., Silverman, D., von Wachter, T. 2015. Factors determining callbacks to job applications by the unemployed: An audit study (Working Paper 21689). National Bureau of Economic Research.

Fein, D.J., Beecroft, E., Blomquist, J.D. 1994. Ohio Transitions to Independence Demonstration. Final impacts for JOBS and work choice. Abt Associates: Cambridge, MA.

Feller, A., Grindal, T., Miratrix, L.W., Page, L.C. 2014. Compared to what? Variation in the impacts of early childhood education by alternative care-type settings. Working paper.

Ferracci, M., Jolivet, G., van den Berg, G.J. 2010. Treatment evaluation in the case of interactions within markets (No. 4700). Working paper, Institute for the Study of Labor (IZA).

Fraker, T., Maynard, R. 1987. The adequacy of comparison group designs for evaluations of employment-related programs. Journal of Human Resources, 22(2), 194-227.

Freedman, S., Friedlander, D., Riccio, J. 1994. GAIN: Benefits, costs, and three-year impacts of a welfare-to-work program. Manpower Demonstration Research Corp.

Freedman, S., Knab, J.T., Gennetian, L.A., Navarro, D. 2000. The Los Angeles Jobs-First GAIN Evaluation: Final report on a work first program in a major urban center. Manpower Demonstration Research Corporation: New York.

Fryer, R. 2013. Teacher incentives and student achievement: Evidence from New York City public schools. Journal of Labor Economics, 31(2), 373-427.

Fryer, R., Levitt, S.D., List, J., Sadoff, S. 2012. Enhancing the efficacy of teacher incentives through loss aversion: A field experiment (Working Paper 18237). National Bureau of Economic Research.

Gautier, P.A., Muller, P., Rosholm, M., Svarer, M., van der Klaauw, B. 2012. Estimating equilibrium effects of job search assistance (No. 9066). CEPR Discussion Papers.

Gold, S.F. 1971. The failure of the Work Incentive (WIN) program. University of Pennsylvania Law Review, 119(3), 485-501.

Greenberg, D.H., Robins, P.K. 1986. The changing role of social experiments in policy analysis. Journal of Policy Analysis and Management, 5(2), 340-362.

Greenberg, D.H., Shroder, M. 2004. The digest of social experiments. The Urban Institute, 3rd edition.

Greenberg, D.H., Shroder, M., Onstott, M. 1999. The social experiment market. The Journal of Economic Perspectives, 13(3), 157-172.

Grogger, J. 2005. Welfare reform, returns to experience, and wages: Using reservation wages to account for sample selection bias. The Review of Economics and Statistics, 91(3), 490-502.

Gronau, R. 1973. The effect of children on the housewife's value of time. Journal of Political Economy, 81(2), S168-S199.

Grossman, J.B., Roberts, J. 1989. Welfare savings from employment and training programs for welfare recipients. The Review of Economics and Statistics, 71(3), 532-537.

Gueron, J. Forthcoming. The politics and practice of social experiments: Seeds of a revolution. Handbook of Field Experiments.

Hagedorn, M., Karahan, F., Manovskii, I., Mitman, K. 2015. Unemployment benefits and unemployment in the Great Recession: The role of macro effects. Federal Reserve Bank of New York Staff Report 646, revised February 2015.

Hagedorn, M., Manovskii, I., Mitman, K. 2015. The impact of unemployment benefit extensions on employment: The 2014 employment miracle? (Working Paper 20884). National Bureau of Economic Research.

Ham, J.C., LaLonde, R.J. 1996. The effect of sample selection and initial conditions in duration models: Evidence from experimental data on training. Econometrica: Journal of the Econometric Society, 64(1), 175-205.

Ham, J.C., Li, X., Reagan, P.B. 2011. Matching and semi-parametric IV estimation, a distance-based measure of migration, and the wages of young men. Journal of Econometrics, 161(2), 208-227.

Hamilton, G., Freedman, S., Gennetian, L., Michalopoulos, C., Walter, J. 2001. National evaluation of welfare-to-work strategies: How effective are different welfare-to-work approaches? Five-year adult and child impacts for eleven programs. US Department of Health and Human Services and US Department of Education: Washington, DC.

Hamilton, G., Scrivener, S. 2012. Increasing employment stability and earnings for low-wage workers: Lessons from the Employment Retention and Advancement (ERA) project. Office of Planning, Research and Evaluation Report 2012-19. Administration for Children and Families, U.S. Department of Health and Human Services.

Harrison, G.W., List, J.A. 2004. Field experiments. Journal of Economic Literature, 42(4), 1009-1055.

Hausman, J.A. 1985. The econometrics of nonlinear budget sets. Fisher-Shultz lecture for the Econometric Society, Dublin: 1982. Econometrica, 53(6), 1255-1282.

Hausman, J.A., Wise, D.A. 1979. Attrition bias in experimental and panel data: The Gary Income Maintenance Experiment. Econometrica, 47(2), 455-73.

Heckman, J.J. 1979. Sample selection bias as a specification error. Econometrica, 47(1), 153-61.

Heckman, J.J. 2010. Building bridges between structural and program evaluation approaches to evaluating policy. Journal of Economic Literature, 48(2), 356-98.

Heckman, J., Hohmann, N., Smith, J., Khoo, M. 2000. Substitution and dropout bias in social experiments: A study of an influential social experiment. The Quarterly Journal of Economics, 115(2), 651-694.

Heckman, J.J., Hotz, V.J. 1989. Choosing among alternative nonexperimental methods for estimating the impact of social programs: The case of manpower training. Journal of the American Statistical Association, 84(408), 862-874.

Heckman, J.J., LaLonde, R.J., Smith, J.A. 1999. The economics and econometrics of active labor market programs. Handbook of Labor Economics, 3, 1865-2097.

Heckman, J.J., Smith, J.A. 1995. Assessing the case for social experiments. The Journal of Economic Perspectives, 9(2), 85-110.

Heckman, J.J., Smith, J., Clements, N. 1997. Making the most out of programme evaluations and social experiments: Accounting for heterogeneity in programme impacts. Review of Economic Studies, 64(4), 487-535.

Heckman, J.J., Vytlacil, E. 2005. Structural equations, treatment effects, and econometric policy evaluation. Econometrica, 73(3), 669-738.

Herrem, J.W., Schmitt, L.C. 1983. Eligibility review pilot project handbook. Wisconsin Department of Industry, Labor, and Human Relations: Madison, WI.

Holland, P.W. 1986. Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945-960.

Horowitz, J.L., Manski, C.F. 2000. Nonparametric analysis of randomized experiments with missing covariate and outcome data. Journal of the American Statistical Association, 95(449), 77-84.

Hotz, J. 1992. Recent experience in designing evaluations of social programs: The case of the National JTPA study. In Garfinkel, I., Manski, C., eds., Evaluating welfare and training programs, Cambridge, MA: Harvard University Press: 76-114.

Hotz, J., Imbens, G., Klerman, J. 2006. Evaluating the differential effects of alternative welfare-to-work training components: A reanalysis of the California GAIN program. Journal of Labor Economics, 24(3), 521-566.

Jackson, K.C., Rockoff, J.E., Staiger, D.O. 2014. Teacher effects and teacher-related policies. Annual Review of Economics, 6(1), 801-825.

Jacobson, L.S. 2009. Strengthening one-stop career centers: Helping more unemployed workers find jobs and build skills. Hamilton Project Discussion Paper 2009-01, April: The Brookings Institution, Washington DC.

Jaggers, M. 1984. ERP pilot project final report. Wisconsin Department of Industry, Labor, and Human Relations: Madison, WI.

Johnson, T.R., Pfiester, J.M., West, R.W., Dickinson, K.P. 1984. Design and implementation of the claimant placement and work test demonstration. SRI International: Menlo Park, CA.

Johnson, W., Kitamura, Y., Neal, D. 2000. Evaluating a simple method for estimating black-white gaps in median wages. American Economic Review, 90(2), 339-343.

Johnston, A.C., Mas, A. 2015. Potential unemployment insurance duration and labor supply: The individual and market-level response to a benefit cut. Unpublished working paper. Princeton University.

Kane, T.J., Staiger, D.O. 2008. Estimating teacher impacts on student achievement: An experimental evaluation (Working Paper 14607). National Bureau of Economic Research.

Keane, M.P. 2010. Structural vs. atheoretic approaches to econometrics. Journal of Econometrics, 156(1), 3-20.

Keeley, M.C., Robins, P.K., Spiegelman, R.G., West, R.W. 1978. The estimation of labor supply models using experimental data. The American Economic Review, 68(5), 873-887.

Kehrer, K.C., Moffitt, R.A., eds. 1976. The Gary income maintenance experiment: Initial findings report. Indiana University: Gary, Ind.

Kemple, J.J., Friedlander, D., Fellerath, V. 1995. Florida's Project Independence. Benefits, costs, and two-year impacts of Florida's JOBS program. Manpower Demonstration Research Corporation: New York.

Kershaw, D., Fair, J. 1976. The New Jersey income maintenance experiment. Volume 1: Operations, Surveys and Administration. Academic Press: New York.

Klepinger, D.H., Johnson, T.R., Joesch, J.M., Benus, J.M. 1997. Evaluation of the Maryland unemployment insurance work search demonstration (Unemployment Insurance Occasional Paper 98-2). U.S. Department of Labor, Employment and Training Administration, Unemployment Insurance Service: Washington DC.

Klepinger, D.H., Johnson, T.R., Joesch, J.M. 2002. Effects of unemployment insurance work-search requirements: The Maryland experiment. Industrial & Labor Relations Review, 56(1), 3-22.

Klerman, J.A., Minzner, A., Harkness, J., Mills, S., Cook, R., Savidge-Wilkins, G. 2013. Design report: Impact evaluation of reemployment and eligibility assessment program. Abt Associates: May 7.

Kline, P., Tartari, M. 2016. Bounding the labor supply responses to a randomized welfare experiment: A revealed preference approach. American Economic Review, 106(4), 972-1014.

Kline, P., Walters, C. 2014. Evaluating public programs with close substitutes: The case of Head Start. UC Berkeley Institute for Research on Labor and Employment Working Paper #123-14.

Kling, J.R., Liebman, J.B., Katz, L.F. 2007. Experimental analysis of neighborhood effects. Econometrica, 75(1), 83-119.

Kling, J.R., J. Ludwig, B. Congdon and S. Mullainathan. Social policy: Mechanism experiments and policy evaluations. Handbook of Field Experiments (forthcoming).

Knox, V.W., Miller, C., Gennetian, L.A. 2000. Reforming welfare and rewarding work: A summary of the final report on the Minnesota Family Investment Program (Vol. 8). Manpower Demonstration Research Corporation, New York.

Kornfeld, R., Bloom, H.S. 1999. Measuring program impacts on earnings and employment: Do unemployment insurance wage reports from employers agree with surveys of individuals? Journal of Labor Economics, 17(1), 168-97.

Kroft, K., Lange, F., Notowidigdo, M.J. 2013. Duration dependence and labor market conditions: Evidence from a field experiment. The Quarterly Journal of Economics, 128(3), 1123-1167.

Krueger, A.B., Mueller, A.I. 2016. A contribution to the empirics of reservation wages. American Economic Journal: Economic Policy, 8(1), 142-179.

LaLonde, R.J. 1986. Evaluating the econometric evaluations of training programs with experimental data. The American Economic Review, 76(4), 604-620.

Landais, C., Michaillat, P., Saez, E. 2015. A macroeconomic theory of optimal unemployment insurance (Working Paper 16526). National Bureau of Economic Research.

Lee, D.S. 2009. Training, wages, and sample selection: Estimating sharp bounds on treatment effects. The Review of Economic Studies, 76(3), 1071-1102.

Lemieux, T., MacLeod, W.B. 2000. Supply side hysteresis: The case of the Canadian unemployment insurance system. Journal of Public Economics, 78(1), 139-170.

List, J.A., Rasul, I. 2011. Field experiments in labor economics. Handbook of Labor Economics, 4(4), 103-228.

Maguire, S., Freely, J., Clymer, C., Conway, M., Schwartz, D. 2010. Tuning in to local labor markets: Findings from the sectoral employment impact study. Public/Private Ventures: New York.

Manpower Demonstration Research Corporation Board of Directors. 1980. Summary and findings of the national supported work demonstration. Ballinger Publishing Company: Cambridge, MA.

Meyer, B.D. 1995. Lessons from the US unemployment insurance experiments. Journal of Economic Literature, 33(1), 91-131.

Mihaly, K., McCaffrey, D.F., Staiger, D.O., Lockwood, J.R. 2013. A composite estimator of effective teaching. MET Project. [Available at: http://www.metproject.org/downloads/MET_Composite_Estimator_of_Effective_Teaching_Research_Paper.pdf].

Miller, C., Van Dok, M., Tessler, B.L., Pennington, A. 2012. Strategies to help low-wage workers advance: Implementation and final impacts of the Work Advancement and Support Center (WASC) demonstration. Manpower Demonstration Research Corp: New York.

Minnesota Department of Jobs and Training. 1990. Re-employ Minnesota. In Johnson, E.R., ed., Reemployment services to unemployed workers having difficulty becoming reemployed (Unemployment Insurance Occasional Paper 90-2). U.S. Department of Labor, Employment and Training Administration, Unemployment Insurance Service: Washington, DC.

Moffitt, R.A. 1979. The labor supply response in the Gary experiment. Journal of Human Resources, 14(4), 477-487.

Newey, W., Powell, J.L., Walker, J.R. 1990. Semiparametric estimation of selection models: Some empirical results. American Economic Review, 80(2), 324-28.

O'Leary, C.J. 2006. State UI job search rules and reemployment services. Monthly Labor Review, 129(6), 27-37. http://research.upjohn.org/jrnlarticles/3.

Oreopoulos, P. 2007. Do dropouts drop out too soon? Wealth, health and happiness from compulsory schooling. Journal of Public Economics, 91, 2213-2229.

Pallais, A. 2014. Inefficient hiring in entry-level labor markets. American Economic Review, 104(11), 3565-3599.

Palmer, J.L., Pechman, J.A. 1978. Welfare in rural areas: The North Carolina-Iowa income maintenance experiment. Brookings Institution: Washington, DC.

Perez-Johnson, I., Q. Moore, and R. Santillano. 2011. Improving the effectiveness of individual training accounts: Long-term findings from an experimental evaluation of three service delivery models. Final Report. Mathematica, Inc.

Pinto, R. 2015. Selection bias in a controlled experiment: The case of Moving to Opportunity. Mimeo., University of Chicago.

Poe-Yamagata, E., J. Benus, N. Bill, H. Carrington, M. Michaelides, and T. Shen. 2011. Impact of the Reemployment and Eligibility Assessment (REA) initiative. Impaq International.

Powell, J.L. 1984. Least absolute deviations estimation for the censored regression model. Journal of Econometrics, 25(3), 303-325.

Robins, P.K. 1985. A comparison of the labor supply findings from the four negative income tax experiments. Journal of Human Resources, 20(4), 567-582.

Rothstein, J. 2010. Teacher quality in educational production: Tracking, decay, and student achievement. Quarterly Journal of Economics, 125(1), 175-214.

Rothstein, J. 2016. Revisiting the impacts of teachers. Unpublished working paper. http://eml.berkeley.edu/~jrothst/workingpapers/rothstein_cfr.pdf.

Schmieder, J.F., von Wachter, T., Bender, S. 2012. The effects of extended unemployment insurance over the business cycle: Evidence from regression discontinuity estimates over 20 years. Quarterly Journal of Economics, 127(2), 701-752.

Schmieder, J.F., von Wachter, T., Bender, S. 2016. The effect of unemployment benefits and nonemployment durations on wages. American Economic Review, 106(3), 739-777.

Schochet, P.Z., Burghardt, J.A. 2008. Do Job Corps performance measures track program impacts? Journal of Policy Analysis and Management, 27(3), 556-576.

Schochet, P., Burghardt, J., McConnell, S. 2008. Does Job Corps work? Impact findings from the National Job Corps Study. Mathematica Policy Research.

Smith, J.A., Todd, P.E. 2005. Does matching overcome LaLonde's critique of nonexperimental estimators? Journal of Econometrics, 125(1), 305-353.

Spiegelman, R.G., O'Leary, C.J., Kline, K.J. 1992. The Washington Reemployment Bonus experiment: Final report (Unemployment Insurance Occasional Paper 92-6). U.S. Department of Labor: Washington, DC.

Springer, Matthew G., Dale Ballou, Laura S. Hamilton, Vi-Nhuan Le, J.R. Lockwood, Daniel F. McCaffrey, Matthew Pepper, and Brian M. Stecher. 2010. Teacher pay for performance: Experimental evidence from the Project on Incentives in Teaching. Conference paper, National Center on Performance Incentives.

SRI International. 1983. Final report of the Seattle-Denver income experiment, Volume I: Design and results. U.S. Department of Health and Human Services: Washington, DC.

Steinman, J.P. 1978. The Nevada claimant placement program. Employment Security Research, Nevada Employment Security Department.

Todd, P.E., Wolpin, K.I. 2006. Assessing the impact of a school subsidy program in Mexico: Using a social experiment to validate a dynamic behavioral model of child schooling and fertility. American Economic Review, 96(5), 1384-1417.

US Department of Health, Education, and Welfare. 1976. Summary report: Rural income maintenance experiment. Government Printing Office: Washington, DC.

Vytlacil, E. 2002. Independence, monotonicity, and latent index models: An equivalence result. Econometrica, 70(1), 331-341.

Walters, C. 2014. Inputs in the production of early childhood human capital: Evidence from Head Start (Working Paper 20639). National Bureau of Economic Research.

Watts, H.W., Rees, A. 1977a. The New Jersey Income Maintenance Experiment, Vol. II: Labor supply responses. Academic Press: New York.

Watts, H.W., Rees, A. 1977b. The New Jersey Income Maintenance Experiment, Vol. III: Expenditures, health, and social behavior, and the quality of the evidence. Academic Press: New York.

Woodbury, S.A., Spiegelman, R.G. 1987. Bonuses to workers and employers to reduce unemployment: Randomized trials in Illinois. The American Economic Review, 77(4), 513-530.

Table 1: Details on Selected Randomized Controlled Trials of Welfare Programs and Other Labor Supply Incentives for Low-Income Workers in the United States

(1) New Jersey Income Maintenance Experiment
Target population: Total family income not exceeding 150 percent of the poverty level
Primary intervention: Negative income tax
Start date: 1968
Cost (nominal $): $7,800,000
Sample size: 725 Treatment; 632 Control; 1,357 Total
Treatment: Eight combinations of income guarantees and tax rates on other income.
Funding source: OEO
Outcomes of interest: (1) Reduction in work effort and (2) Lifestyle changes

(2) Rural Income Maintenance Experiment
Target population: Rural, low-income families
Primary intervention: Negative income tax
Start date: 1970
Cost (nominal $): $6,100,000
Sample size: 269 Treatment; 318 Control; 587 Total
Treatment: Five negative income tax plans.
Funding source: The Ford Fdn.; OEO
Outcomes of interest: (1) Work behavior; (2) Health, school, and other effects on poor children; and (3) Savings and consumption behavior

(3) Seattle-Denver Income Maintenance Experiment
Target population: Family earning less than $11,000 in 1971 dollars
Primary intervention: Negative income tax
Secondary intervention: Vocational training
Start date: 1970
Cost (nominal $): $77,500,000
Sample size: 1,801 Treatment 1; 946 Treatment 2; 1,012 Treatment 3; 1,041 Control
Treatment: Two types of treatment: a negative income tax plan and a subsidy to vocational training.
Funding source: HEW, HHS
Outcomes of interest: (1) Effects on labor supply; (2) Marital stability; and (3) Other lifestyle changes

(4) Gary Income Maintenance Experiment
Target population: Black families with at least one child under the age of 18
Primary intervention: Negative income tax
Start date: 1971
Cost (nominal $): $20,300,000
Sample size: 1,028 Treatment; 771 Control; 1,799 Total
Treatment: Four combinations of guarantee and tax.
Funding source: HEW
Outcomes of interest: (1) Employment; (2) Schooling; (3) Infant mortality and morbidity; (4) Educational achievement; and (5) Housing consumption

(5) California Work Pays Demonstration Program (CWPDP)
Target population: One- and two-parent families receiving AFDC
Primary intervention: Earned income disregard
Start date: 1993
Cost (nominal $): $4,500,000
Sample size: 6,278 Treatment 1; 3,471 Treatment 2; 3,276 Control 1; 1,695 Control 2; 14,720 Total
Treatment: The treatment involved changing two provisions of the AFDC program. The "$30 and one-third" provision applied to all AFDC families and allowed welfare recipients to keep the first $30 and one-third of the remaining wages before welfare grant determinations were made. However, it expired after the recipient had been in the program for four months, and thereafter dollar-for-dollar reductions in grant occurred for every dollar of earnings. Under the 100-hour rule, which applied only to two-parent families, the total work hours per month for the primary wage earner could not exceed 100 hours without loss of eligibility. Experimentals received a waiver of the time limit on the $30 and one-third income disregard, and a waiver of the 100-hour rule. However, the cash grants of experimentals were reduced by 8.5 percent. Controls were subject to the general AFDC rules, with expiring disregards, ineligibility after 100 hours, and higher benefits.
Funding source: CA Dept of Social Services
Outcomes of interest: (1) Employment; (2) Earnings; and (3) Welfare receipt

(6) Florida Family Transition Program (FTP)
Target population: Families on AFDC
Primary intervention: Earned income disregard
Secondary intervention: Individual job search assistance; Case management
Start date: 1994
Cost (nominal $): $11,200,000
Sample size: 1,400 Treatment; 1,400 Control; 2,800 Total
Treatment: Limited welfare benefits unless "job-ready", enhanced earnings disregard, and intensive case management
Funding source: FL Dept of Children and Families; US Department of Health and Human Services
Outcomes of interest: (1) Earnings; (2) Welfare benefit receipts; and (3) Outcomes for children

(7) Minnesota Family Investment Program (MFIP)
Target population: AFDC recipient and recent applicant families
Primary intervention: Reemployment bonus
Secondary intervention: Job search incentive; Child care services
Start date: 1994
Cost (nominal $): $5,090,300
Sample size: 5,275 Treatment 1; 1,933 Treatment 2; 5,634 Treatment 3; 1,797 Control; 14,639 Total
Treatment: MFIP provided a 20 percent grant increase when recipients became employed, increased the level of income that would be disregarded in grant calculation, and paid the child care subsidy directly to the caregiver. Two-parent families were not subject to work history requirements or to the 100-hour rule. Both single-parent and two-parent families assigned to MFIP were subject to mandatory participation in employment services. Rules and procedures were simplified by combining Food Stamps, AFDC, and Minnesota's Family General Assistance (FGA) to form a single cash benefit program. Subjects assigned to the MFIP incentives-only group received the same benefits as MFIP, but were not required to participate in training services. Two other groups.
Funding source: MN Dept of Human Services; Ford Fdn.; HHS; US Department of Agriculture; Charles Stewart Mott Fdn.; Annie E Casey Fdn.; McKnight Fdn.; Northwest Area Fdn.
Outcomes of interest: (1) Employment; (2) Earnings; (3) Welfare receipt; (4) Total family income; and (5) Other measures of child and family well-being

(8) Connecticut Jobs First
Target population: AFDC recipients
Primary intervention: Earned income disregard; Time limit
Secondary intervention: Job search incentives; Vocational training
Start date: 1996
Cost (nominal $): $5,400,000
Sample size: 2,138 Treatment; 1,821 Control; 3,959 Total
Treatment: Earnings disregarded below the federal poverty level and required to participate in Job Search Skills Training.
Funding source: CT Dept of Social Services
Outcomes of interest: (1) Employment; (2) Earnings; (3) Benefit receipt; and (4) Other measures of child well-being

(9) Illinois Unemployment Insurance Incentive Experiment
Target population: UI claimants
Primary intervention: Reemployment bonus
Start date: 1984
Cost (nominal $): $800,000
Sample size: 4,186 Treatment (claimants); 3,963 Treatment (employers); 3,963 Control; 12,112 Total
Treatment: The unemployed were offered a $500 bonus if they found a job within 11 weeks and held it for 4 months.
Funding source: IL Dept of Employment Security; WE Upjohn Institute for Employment Research
Outcomes of interest: (1) Reductions in unemployment spells and (2) Net program savings

(10) Pennsylvania Reemployment Bonus Demonstration
Secondary intervention: Job search workshop
Start date: 1988
Cost (nominal $): $990,000
Sample size: 14,086 Treatment; 3,392 Control; 17,478 Total
Treatment: Five combinations of bonus amount and qualification period.
Funding source: DOL
Outcomes of interest: (1) UI receipt; (2) Employment; and (3) Earnings

(11) Washington State Reemployment Bonus Experiment
Start date: 1988
Cost (nominal $): $450,000
Treatment: Six variations of reemployment bonus amount and qualification periods.
Funding source: Alfred P Sloan Fdn.; US DOL, ETA
Outcomes of interest: (1) Weeks of insured unemployment and (2) UI receipt

Sources: (1) Kershaw and Fair, 1976; Watts and Rees, 1977a and 1977b; (2) US Department of Health, Education, and Welfare, 1976; Palmer and Pechman, 1978; (3) SRI International, 1983; (4) Kehrer, McDonald, and Moffitt, 1980; (5) Becerra, Lew, Mitchell, and Ono, 1998; (6) Bloom, Kemple, Morris, Scrivener, Verma, and Hendra, 2000; (7) Knox, Miller, and Gennetian, 2000; (8) Bloom, Scrivener, Michalopoulos, Morris, Hendra, Adams-Ciardullo, and Walter, 2002; (9) Woodbury and Spiegelman, 1987; (10) Corson, Decker, Dunstan, and Kerachsky, 1991; (11) Spiegelman, O'Leary, and Kline, 1992.

Abbreviations: DOL = US Department of Labor; ETA = Employment and Training Administration; Fdn. = Foundation; OEO = Office of Economic Opportunity; HEW = US Department of Health, Education, and Welfare; HHS = US Department of Health and Human Services.

Table 2: Details on Selected Randomized Controlled Trials of Programs Offering Job Training and Work Experience for Low-Income Individuals in the United States

(1) National Supported Work Demonstration (NSWD)
Target population: AFDC recipients, ex-offenders, substance abusers, and high school dropouts
Primary intervention: Work experience
Start date: 1975
Cost (nominal $): $82,400,000
Treatment: Employment in a structured work experience program involving peer group support, a graduated increase in work standards, and close sympathetic supervision, for 12 to 18 months.
Funding source: DOL, ETA; DOJ; Law Enforcement Assistance Administration; HHS; National Institute on Drug Abuse; HUD; US Department of Commerce; Ford Fdn.
Outcomes of interest: (1) Increases in post-treatment earnings; (2) Reductions in criminal activity; (3) Reductions in transfer payments; and (4) Reductions in drug abuse

(2) AFDC Homemaker--Home Health Aide Demonstrations
Target population: AFDC recipients
Primary intervention: Work experience
Start date: 1983
Cost (nominal $): $8,000,000
Sample size: 4,750 Treatment; 4,750 Control; 9,500 Total
Treatment: Experimental AFDC subjects (trainees) received a four- to eight-week training course to become a homemaker-home health aide, followed by a year of subsidized employment. Control subjects did not receive this training, nor did they receive subsidized employment.
Funding source: Health Care Financing Administration
Outcomes of interest: (1) Employment; (2) Earnings; and (3) AFDC and food stamp payments and receipt

(3) National Job Training Partnership Act (JTPA) Study
Target population: Eligible Job Training Partnership Act Title II adults and out-of-school youth
Interventions (primary and secondary): Vocational training; General education; Work experience; On-the-job training; Individual job search assistance
Start date: 1987
Cost (nominal $): $23,000,000
Sample size: 20,602
Treatment: Classroom training, on-the-job training, job search assistance, basic education, and work experience.
Funding source: DOL
Outcomes of interest: (1) Earnings; (2) Employment; (3) Welfare receipt; and (4) Attainment of educational credentials and occupational competencies

(4) Greater Avenues for Independence (GAIN)
Target population: AFDC recipients
Interventions (primary and secondary): Vocational training; General education; Work experience; Individual job search assistance
Start date: 1988
Sample size: 24,528 Treatment; 8,223 Control; 32,751 Total
Treatment: Basic education, job search activities, assessments, skills training, and work experience.
Funding source: California Department of Social Services (CDSS)
Outcomes of interest: (1) Participation in employment-related activities; (2) Earnings; (3) Welfare receipt; and (4) Employment

(5) JOBS
Target population: All recipients of ADC (Ohio's AFDC program)
Interventions (primary and secondary): Work experience; General education; Individual job search assistance
Start date: 1989
Cost (nominal $): $3,000,000
Sample size: 24,120 Treatment; 4,371 Control
Treatment: Mandatory employment and training services, which included basic and post-secondary education, community work experience, and job search assistance.
Funding source: OH Dept of Human Services
Outcomes of interest: (1) Employment; (2) Earnings; and (3) Welfare receipt

(6) Sectoral Employment Impact Study
Target population: Low-income, disadvantaged workers and job seekers
Primary intervention: Vocational training
Secondary intervention: Individual job search assistance
Start date: 2003
Sample size: 1,286 Total
Treatment: Industry-specific training programs that prepare unemployed and underskilled workers for skilled positions and connect them with employers seeking to fill such vacancies. Sectoral programs employ various approaches depending on the organization leading the effort and local employers’ needs.
Funding source: Charles Stewart Mott Fdn.
Outcomes of interest: (1) Earnings; (2) Employment; and (3) Quality of jobs

(7) Work Advancement and Support Center (WASC) Demonstration
Target population: Low-wage workers
Primary intervention: Vocational training
Secondary intervention: On-the-job training; Case management
Start date: 2005
Sample size: 1,176 Dayton; 971 San Diego; 705 Bridgeport; 2,852 Total
Treatment: The program offered participating workers intensive employment retention and advancement services, including career coaching and access to skills training. It also offered them easier access to work supports, in an effort to increase their incomes in the short run and help stabilize their employment. Finally, both services were offered in one location, in existing One-Stop Career Centers created by the Workforce Investment Act (WIA) of 1998, and by co-located teams of workforce and welfare staff.
Funding source: State of Ohio; County of San Diego Health and Human Services Agency; DOL ETA; U.S. Department of Agriculture, Food and Nutrition Service; HHS; Administration for Children and Families; Ford Fdn.; Rockefeller Fdn.; Annie E. Casey Fdn.; David and Lucile Packard Fdn.; The William and Flora Hewlett Fdn.; Joyce Fdn.; James Irvine Fdn.; Charles Stewart Mott Fdn.; Robert Wood Johnson Fdn.
Outcomes of interest: (1) Employment and (2) Earnings (along with many other outcome measures)

(8) JOBSTART
Target population: School dropouts aged 17-21 years
Primary intervention: General education
Secondary intervention: Vocational training; Individual job search assistance
Start date: 1985
Cost (nominal $): $6,200,000
Sample size: 1,163 Treatment; 1,149 Control; 2,312 Total
Treatment: Education and vocational training, support services, and job placement assistance.
Funding source: DOL; Rockefeller Fdn.; Ford Fdn.; Charles Stewart Mott Fdn.; William and Flora Hewlett Fdn.; more foundations.
Outcomes of interest: (1) Educational attainment; (2) Employment; (3) Earnings; and (4) Welfare receipt

(9) National Job Corps Study
Target population: 16-24 year olds
Primary intervention: General education
Secondary intervention: Vocational training; Health care services; Housing services
Start date: 1994
Cost (nominal $): $21,587,202
Sample size: 9,409 Treatment; 5,977 Control; 15,386 Total
Treatment: Treatment group allowed to enroll in Job Corps. Job Corps centers provide vocational training, academic instruction, health care, social skills training, and counseling.
Funding source: DOL, ETA
Outcomes of interest: (1) Employment; (2) Earnings; (3) Education and job training; (4) Welfare receipt; (5) Criminal behavior; (6) Drug use; (7) Health factors; and (8) Household status

Sources: (1) MDRC Board of Directors, 1980; (2) Bell, Burstein, and Orr, 1987; (3) Bell, Bloom, Cave, Doolittle, and Orr, 1994; Bloom, Orr, Bell, Cave, Doolittle, Lin, and Bos, 1997; (4) Freedman, Friedlander, Riccio, 1994; (5) Fein, Beecroft, and Blomquist, 1994; (6) Maguire, Freely, Clymer, Conway, and Schwartz, 2010; (7) Miller, Van Dok, Tessler, and Pennington, 2012; (8) Cave, Bos, Doolittle, and Toussaint, 1993; (9) Burghardt, Schochet, McConnell, Johnson, Gritz, Glazerman, Homrighausen, and Jackson, 2001.

Abbreviations: DOJ = US Department of Justice; HHS = US Department of Health and Human Services; HUD = US Department of Housing and Urban Development; DOL = US Department of Labor; ETA = Employment and Training Administration; Fdn. = Foundation.

Table 3: Details on Selected Randomized Controlled Trials of Job Search Assistance Programs for Low-Income Individuals and Unemployed Workers in the United States

(1) Project Independence--Florida
Target population: Single-parent heads of household who were required to participate in the program (recipients of AFDC)
Primary intervention: Job Club
Secondary interventions: General education; Vocational training
Start date: 1990
Cost (nominal $): $3,600,000
Treatment: The experimental group was eligible to receive Project Independence services and was subject to a participation mandate. Services included independent job search, job club, assessment, basic education, and training. The control group was not eligible for these services and was not subject to a participation mandate.
Funding source: Florida Department of Health and Rehabilitative Services; Ford Fdn.; US Department of Health and Human Services
Outcomes of interest: (1) Employment; (2) Earnings; and (3) AFDC receipt

(2) National Evaluation of Welfare-to-Work Strategies (NEWWS)
Target population: Single-parent welfare recipients
Primary intervention: Job Club; Case management
Secondary interventions: General education; Vocational training
Start date: 1991
Cost (nominal $): $31,700,000
Sample size: 44,569 Total
Treatment: Eleven programs, broadly defined as either employment-focused or education-focused, were tested in seven sites across the US.
Outcomes of interest: (1) Employment; (2) Earnings; (3) Welfare receipt; (4) Cost-effectiveness; and (5) Child well-being

(3) Indiana Welfare Reform Evaluation
Target population: Families on welfare
Primary intervention: Individual job search assistance
Secondary interventions: Work experience
Start date: 1995
Cost (nominal $): $23,200,000
Treatment: Experimentals were subject to new welfare reform policies: assisted job search, broader mandatory work participation, earned income disregard, time limits for case assistance, a revised system of child care provision, family benefit cap, and parental responsibility (such as immunizing children). Controls continued under the traditional AFDC policies.
Funding source: Indiana Family and Social Services Administration
Outcomes of interest: (1) Employment; (2) Earnings; (3) Welfare receipt; (4) Income; (5) Health insurance; and (6) Parental responsibility

(4) LA Jobs-First GAIN Evaluation
Target population: Single-parent (AFDC-FG) and two-parent (AFDC-U) welfare families in Los Angeles County
Interventions (primary and secondary): Job Club; Individual job search assistance; Job search workshop
Start date: 1995
Cost (nominal $): $29,900,000
Treatment: Members of the treatment group were enrolled in Jobs-First GAIN. These subjects were required to participate in at least one of the job search activities, including job clubs and other informational services and job search training sessions. Experimentals were also exposed to Jobs-First GAIN's intensive work-first message. Sanctions were imposed, usually in the form of partial reductions in welfare benefits, for failure to participate. Controls were not exposed to any of Jobs-First GAIN's services, the intensive work-first message, or sanctions. Controls could still receive assistance from other agencies and were subject to existing welfare rules.
Funding source: Los Angeles Department of Public Social Services; Ford Fdn.
Outcomes of interest: (1) Employment; (2) Earnings; (3) Welfare benefits; (4) Outcomes for children; and (5) Incremental effects compared with previous LA GAIN program

(5) Nevada Claimant Placement Program (NCPP)
Target population: UI claimants
Primary intervention: Individual job search assistance
Secondary interventions: Vocational training
Start date: 1977
Sample size: 3,500
Treatment: More staff attention and more referrals, weekly interviews and eligibility checks, all services from the same ES/UI team, which coordinated their efforts
Outcomes of interest: (1) Weeks of benefits; (2) Earnings; (3) Enforcement of work search rules; (4) Job searches; and (5) Referrals and placements

(6) Claimant Placement and Work Test Demonstration
Target population: UI claimants
Primary intervention: Job search incentives
Secondary interventions: Individual job search assistance
Start date: 1983
Cost (nominal $): $225,000
Sample size: 1,485 Treatment 1; 1,493 Treatment 2; 1,666 Treatment 3; 1,277 Treatment 4
Treatment: Job search and placement services
Funding source: US Department of Health and Human Services; Ford Fdn.
Outcomes of interest: (1) Employment and (2) UI payment reductions

(7) Wisconsin Eligibility Review Pilot Project (ERP)
Target population: UI claimants indefinitely separated from most recent job
Primary intervention: Individual job search assistance
Start date: 1983
Sample size: 5,000
Treatment: 6-hour job search workshop conducted by ES staff; also tried 3-hour job search workshop
Outcomes of interest: (1) Weeks of benefits; (2) Earnings; (3) Enforcement of work search rules; (4) Job searches; and (5) Referrals and placements

(8) Reemploy Minnesota (REM)
Target population: Unemployed
Interventions (primary and secondary): Case management; Individual job search assistance; Job search workshop
Start date: 1988
Cost (nominal $): $835,000
Sample size: 4,212 Treatment; unknown Control (roughly 10 times treatment)
Treatment: More personalized and intensive unemployment insurance (UI) services, including case management, intensive job search assistance and job matching, claimant targeting for special assistance, and a job-seeking skills seminar. The control group received regular UI services.
Funding source: Unemployment Insurance Contingent Account of the Minnesota Department of Jobs and Training
Outcomes of interest: (1) Duration of UI benefits and (2) Amount of UI benefits

(9) Kentucky Worker Profiling and Reemployment Services (WPRS) Experiment
Target population: UI claimants
Primary intervention: Individual job search assistance
Secondary interventions: Vocational training
Start date: 1994
Cost (nominal $): $15,000
Sample size: 1,236 Treatment; 745 Control; 1,981 Total
Treatment: Structured job search activities, employment counseling, and retraining
Funding source: Kentucky Department of Employment Services
Outcomes of interest: (1) Earnings; (2) Length of benefit receipt; and (3) Amount of UI benefits received

(10) Maryland Unemployment Insurance Work Search Demonstration
Target population: UI claimants
Primary intervention: Alternative work search policies
Start date: 1994
Cost (nominal $): $250,000
Sample size: 3,510 Treatment 1; 3,455 Treatment 2; 3,680 Treatment 3; 3,400 Treatment 4; 4,812 Control 1; 4,901 Control 2; 23,758 Total
Treatment: Four different changes to Maryland UI eligibility rules
Funding source: US DOL ETA
Outcomes of interest: (1) UI payments in terms of weeks and dollars; (2) Continuing eligibility; (3) Employment; and (4) Earnings

(11) Reemployment and Eligibility Assessment (REA)
Target population: UI claimants
Primary intervention: Individual job search assistance
Secondary interventions: Case management; Vocational training
Start date: 2013
Treatment: (1) Current REA Program: assistance (defined as the provision of labor market information, developing an individual reemployment plan, a referral to reemployment services, and direct provision of reemployment services) plus enforcement (see below). (2) Enforcement Only: the requirement that claimants appear for the REA meeting and that REA program staff verify claimants’ eligibility and their participation in work search activities, with referral to adjudication and possible suspension of UI benefits for those who do not participate.
Funding source: US DOL ETA
Outcomes of interest: (1) UI benefit receipt; (2) Employment; and (3) Earnings

Sources: (1) Kemple, Friedlander, and Fellerath, 1995; (2) Hamilton, Freedman, Gennetian, Michalopoulos, and Walter, 2001; (3) Beecroft, Lee, Long, Holcomb, Thomson, Pindus, O'Brien, and Bernestin, 2003; (4) Freedman, Knab, Gennetian, and Navarro, 2000; (5) Steinman, 1978; (6) Johnson, Pfiester, West, and Dickinson, 1984; Corson, Long, and Nicholson, 1984; (7) Herrem and Schmitt, 1983; Jaggers, 1984; (8) Minnesota Department of Jobs and Training, 1990; (9) Black, Smith, Berger, and Noel, 2003; (10) Klepinger, Johnson, Joesch, and Benus, 1997; (11) Klerman, Minzner, Harkness, Mills, Cook, and Savidge-Wilkins, 2013.

Abbreviations: DOL = US Department of Labor; ETA = Employment and Training Administration; Fdn. = Foundation.
