Social Experiments in the Labor Market
Jesse Rothstein*
University of California, Berkeley and NBER
Till von Wachter
University of California, Los Angeles and NBER
Chapter prepared for the Handbook of Field Experiments.
July 2016
Abstract
Large-scale social experiments were pioneered in labor economics, and are the basis for much of what we know about topics ranging from the effect of job training to incentives for job search to labor supply responses to taxation. Random assignment has provided a powerful solution to selection problems that bedevil non-experimental research. Nevertheless, many important questions about these topics require going beyond random assignment. This applies to questions pertaining to both internal and external validity, and includes effects on endogenously observed outcomes, such as wages and hours; spillover effects; site effects; heterogeneity in treatment effects; multiple and hidden treatments; and the mechanisms producing treatment effects. In this Chapter, we review the value and limitations of randomized social experiments in the labor market, with an emphasis on these design issues and approaches to addressing them. These approaches expand the range of questions that can be answered using experiments by combining experimental variation with econometric or theoretical assumptions. We also discuss efforts to build the means of answering these types of questions into the ex ante design of experiments. Our discussion yields an overview of the expanding toolkit available to experimental researchers.
* Contact: [email protected], tvwachter@econ.ucla.edu. We thank Ben Smith and Audrey Tiew for sterling research assistance, and Angus Deaton, Larry Katz, Jeff Smith, and conference participants for helpful comments.
I. Introduction
II. What are Social Experiments? Historical and Econometric Background
  a. A Primer on the History and Topics of Social Experiments in the Labor Market
  b. Social experiments as a tool for program evaluation
    i. The benchmark case: Experiments with perfect compliance
    ii. Imperfect compliance and the local average treatment effect
  c. Limitations of the experimental paradigm
    i. Spillover Effects and the Stable Unit Treatment Value Assumption
    ii. Endogenously observed outcomes
    iii. Site and Group Effects
    iv. Treatment Effect Heterogeneity and External Validity
    v. Hidden Treatments
    vi. Mechanisms and Multiple Treatments
  d. Quasi-experimental and Structural Research Designs
III. A more thorough overview of labor market social experiments
  a. Labor Supply Experiments
  b. Training experiments
  c. Job Search Assistance
  d. Practical Aspects of Implementing Social Experiments
IV. Going Beyond Treatment-Control Comparisons to Resolve Additional Design Issues
  a. Spillover effects and SUTVA
    i. Addressing the issue ex post
    ii. Addressing the issue ex ante through the design of the experiment
  b. Endogenously observed outcomes
    i. Addressing the issue ex post
      Parametric selection corrections
      Non- and semi-parametric selection corrections
    ii. Addressing the issue ex ante through the design of the experiment
  c. Site and group effects
    i. Addressing the issue ex post
    ii. Addressing the issue ex ante through the design of the experiment
  d. Treatment effect heterogeneity and external validity
    i. Addressing the issue ex post
    ii. Addressing the issue ex ante through the design of the experiment
  e. Hidden treatments
    i. Addressing the issue ex post
    ii. Addressing the issue ex ante through the design of the experiment
  f. Mechanisms and multiple treatments
    i. Addressing the issue ex post
    ii. Addressing the issue ex ante through the design of the experiment
V. Conclusion
I. Introduction
There is a very long history of social experimentation in labor markets.
Experiments have addressed core labor market topics such as labor supply, job
search, and human capital accumulation, and have been central to the academic
literature and policy discussion, particularly in the United States, for many decades.
By many accounts, the first large-scale social experiment was the New Jersey
Income Maintenance Experiment, initiated in 1968 by the U.S. Office of Economic
Opportunity to test the effect of income transfers and income tax rates on labor
supply. Where many subsequent experiments have been designed to evaluate a
single program or treatment each, the Income Maintenance Experiment was
intended instead to map out a response surface. Participants were assigned to a
control group or to one of eight treatment arms that varied in the income guarantee
to a family that did not work and the rate at which this was taxed away as earnings
rose. Three follow-up experiments – in rural North Carolina and Iowa; in Gary,
Indiana; and in Seattle and Denver – with varying benefit levels and tax rates (and,
in Seattle and Denver, a cross-cutting set of counseling and training treatments)
were begun before data collection for the New Jersey experiment was complete.
Other early labor market experiments examined the effects of job search
encouragement for Unemployment Insurance recipients; job training and job search
programs; subsidized jobs for the hard-to-employ; and programs designed to push
welfare recipients into work (Greenberg and Robins, 1986; Gueron, this volume).
These topics have been returned to repeatedly in the years since as researchers
have sought to test new program designs or to build on the limitations of earlier
research. There have also been many smaller-scale experiments, on bonus pay
schemes, management structure, and other firm-level policies.1
From the beginning, the use of random assignment experiments (also known
as randomized controlled trials, or RCTs) has been controversial in labor
economics.2 The primary, powerful appeal of RCTs is that they solve the assignment,
or selection, problem in program evaluation. In non-experimental studies (also
known as “observational” studies), program participants may differ in observed and
unobserved ways from those who do not participate, and econometric adjustments
for this selection rely on unverifiable, often implausible assumptions (Lalonde 1986;
Fraker and Maynard 1987; though see also Heckman and Hotz, 1989). With a
well-executed randomization study, however, the treatment and control groups are
comparable by design, making it straightforward to identify the effect of the
treatment under study.
But set against this very important advantage are a number of drawbacks to
experimentation. Early on, it was recognized that RCTs can be very expensive and
hard to implement successfully. For example, it is not always possible to ensure that
everyone assigned to receive a treatment receives a full dose, while those assigned
to the control group receive none, though this is the experimental ideal. Sometimes
it is not feasible to control participants’ behavior, and many participants deviate
from their intended treatment assignments. In other cases, ethical, political, or
operational considerations make it undesirable to limit access to alternative
treatments. Although this can be partly addressed within the basic experimental
paradigm, it does limit what can be learned.
1 We omit here audit studies aimed at uncovering discrimination in the labor market and elsewhere (e.g., Bertrand and Mullainathan 2004; Kroft, Lange, and Notowidigdo 2013; Farber, Silverman, and von Wachter 2015). These are covered by Bertrand and Duflo, elsewhere in this volume.
2 For recent criticisms of reliance on RCTs with particular relevance to labor market studies, see Deaton (2010) and Heckman (2010). See also Heckman and Smith (1995).
More generally, while random assignment solves the assignment problem, it
alone is not sufficient to resolve other problems that researchers often face. Many
questions of interest can be answered only with something more than the familiar
two-armed randomized control trial – a more complex experimental design, the
augmentation of experimental data with additional, non-experimental data,
theoretically grounded assumptions, or a combination of these. We consider a
number of such questions in this chapter. These include:
• Questions about impacts on endogenously observed outcomes. Consider the
effect of job training on wages. Because wages are observed only for those
who have jobs, and because training may affect the likelihood of working, the
contrast in mean wages between randomly assigned treatment and control
groups does not compare like to like and thus does not solve the assignment
problem for this outcome.
• Questions about spillovers and market-level impacts. When one individual’s
outcome depends on others’ treatment assignments, experimental estimates
of treatment effects can be misleading about a program’s overall effect. In the
context of labor market programs, an increase in job search effort by a
treatment group may lower the control group’s job-finding chances, leading
to an overstatement of the program’s total effect (which will itself depend
importantly on the scale at which the program is implemented). Similar
issues can arise if subjects communicate with each other, leading to a dilution
in treatment contrasts when access to information is part of the treatment.
• Questions about heterogeneity of treatment effects. Experiments have limited
ability to identify heterogeneity of treatment effects, especially if
heterogeneity is not fully characterized by well-defined observable
characteristics. This is often of first-order importance, as in many cases the
relevant question is not whether to offer a program (e.g., job training) but for
whom to make it available, or which versions of the program are most
effective (and why).
• Questions about generalizability. While in ideal cases experiments have high
internal validity for the effect of the specific program under study on the
specific experimental population, in the setting in which it is studied, they
may have limited external validity for generalizations to other locations, to
other programs (or even to other implementations of the same program), or
to other populations. For example, a reemployment bonus program may have
a very different effect in a full-employment local economy than when the
local area is in a recession, or the same program offered in different sites may
have dramatically different effects due to variation in local program
administration or context.
• Questions about mechanisms. Many questions of interest in labor market
research do not reduce to the effects of specific “treatments” on observed
outcomes, but relate, at best, to the mechanisms by which those effects arise.
For example, an important question for the analysis of unemployment
insurance programs is whether the unemployed are liquidity constrained or
whether they can borrow or save to smooth consumption optimally across
periods of employment and unemployment. And important questions about
the design of welfare and disability policy turn on whether observed
non-employment is due to high disutility of work or to moral hazard. In each case,
we want to distinguish income and substitution effects, a distinction that is in
general not identified from the simple effect of a treatment on an observed
outcome. Carefully designed experiments can shed light on the phenomena of
interest, but may not be able to answer them directly.
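The first question in the list above, on endogenously observed outcomes, can be illustrated with a small simulation. This is a minimal sketch rather than anything drawn from the chapter: the data-generating process, the function name `simulate_wage_contrast`, and every parameter value are hypothetical. Training is assumed to leave each person's wage unchanged but to raise the chance of working, so the employed members of the treatment group include more low-ability workers than the employed members of the control group.

```python
import random
from statistics import mean


def simulate_wage_contrast(n=50000, seed=0):
    """Illustrative only: training has zero wage effect but raises employment."""
    rng = random.Random(seed)
    wages = {0: [], 1: []}     # observed log wages among the employed, by arm
    employed = {0: 0, 1: 0}    # number employed, by arm
    assigned = {0: 0, 1: 0}    # number assigned, by arm
    for _ in range(n):
        z = rng.randint(0, 1)              # random assignment to training
        ability = rng.gauss(0.0, 1.0)      # not observed by the researcher
        log_wage = 2.0 + 0.5 * ability     # identical in both arms: no wage effect
        # Hypothetical employment rule: training shifts the employment index up.
        works = ability + 0.8 * z + rng.gauss(0.0, 1.0) > 0.0
        assigned[z] += 1
        if works:
            employed[z] += 1
            wages[z].append(log_wage)      # wages are seen only for those with jobs
    employment_effect = employed[1] / assigned[1] - employed[0] / assigned[0]
    observed_wage_contrast = mean(wages[1]) - mean(wages[0])
    return employment_effect, observed_wage_contrast


employment_effect, observed_wage_contrast = simulate_wage_contrast()
print(f"employment effect:      {employment_effect:+.3f}")
print(f"observed wage contrast: {observed_wage_contrast:+.3f}  (true wage effect is 0)")
```

Under these assumptions the employment contrast is a valid experimental estimate, while the wage contrast among workers is negative even though no one's wage changed. Randomization balances the two arms as assigned, not the two groups of workers.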
To be clear, all of these questions are thorny under any methodological
approach, and are generally no easier to answer in quasi-experimental studies than
in randomized experiments. One vocal group of critics of experimentation points to
the importance of identifying the “structural” parameters – a full characterization of
program enrollment decisions and the behavioral processes that lead to the
observed outcomes – that determine program selection and impacts (see, e.g., Keane
2010). In principle, many of the design issues above could indeed be avoided or
addressed with estimates of the underlying structural parameters. But these
structural parameters are difficult to measure. So-called structural methods
generally trade off internal validity in pursuit of more external validity, but a study
that fails to solve the assignment problem is unlikely to be any more generalizable
than it is internally valid.
Unfortunately, while experiments can sometimes be designed to identify a
few key structural parameters, or at least important combinations of them, it is
rarely possible to design an experiment that directly identifies all of the structural
parameters of interest. Thus, there can be value in combining the two paradigms.
This involves imposing untestable assumptions about the processes of interest,
while still resting on experimentation (or other empirical methods that offer high
internal validity) where possible. The additional assumptions can dramatically
enhance external validity if they are correct, though if they are incorrect – and this is
generally untestable – both internal and external validity suffer.
The current frontier for labor market research – as in other fields – thus
involves combining the best features of the two approaches to permit answers to
more questions than are addressed by simple experiments, while retaining at least
some of the credibility that these experiments can provide.
In this chapter, we discuss a variety of questions common in labor market
research that require this sort of approach. We distinguish two broad strategies for
answering these questions using experimental data. First, one can augment
traditional randomized experiments by imposing additional structure, either
economic or econometric, after the fact. In many cases, the amount of structure
required, and the strength of the additional assumptions that are necessary, is small
relative to the value of the results that can be obtained. Our review gives a snapshot
of an expanding toolkit with which researchers can address a wider range of
questions based on variation from RCTs.3
The second broad strategy is to address the limitations of traditional
experiments ex ante, via design of the experimental intervention or evaluation itself.
In many cases, clever design choices – multiple treatment arms, carefully designed
stratification, or randomization both across and within groups, for example – can
allow for richer conclusions than would be possible via traditional experiments.
This sort of approach has a long history – indeed, the very first large-scale social
experiments, the income maintenance experiments of the late 1960s and early
1970s, can be seen as a version of it. But the pendulum swung away for a long time,
and researchers have only recently begun to return to experimental designs that
synthesize random experimental variation with more structural modeling. Recent
examples of this approach include Kling, Liebman, and Katz (2007), who use it to
address potential biases from endogenous attrition, and Crepon, Duflo, Gurgand,
Rathelot, and Zamora (2013), who quantify the importance of spillovers. In our
view, approaches like these represent the current research frontier.
The rest of this chapter proceeds as follows. In Section II, we give brief
overviews of the history of social experiments in the labor market and of the value
of RCTs for solving selection problems, and summarize potential design issues that
remain even with random assignment. In Section III, we review the types of
programs and questions that have been analyzed, their main findings, and practical
challenges that labor market experiments often confront. Section IV discusses
approaches to addressing the design challenges from Section II and thereby
expanding the range of questions that can be answered. We discuss both ex ante and
ex post approaches to resolving (or at least ameliorating) the issues. Section V offers
some concluding comments.
3 This includes analyses of issues such as endogenously observed outcomes (e.g., Ahn and Powell 1993, Grogger 2005, Lee 2009); hidden treatments (e.g., Kline and Walters 2014, Feller, Grindal, Miratrix, and Page 2014, Pinto 2015); heterogeneous treatment effects (e.g., Kline and Walters 2014, Heckman and Vytlacil 2005); and multiple treatments and mechanisms (e.g., Card and Hyslop 2005, Schmieder, von Wachter, and Bender 2016, DellaVigna, Lindner, Reizer and Schmieder 2016).
II. What are Social Experiments? Historical and Econometric Background
a. A Primer on the History and Topics of Social Experiments in the Labor Market
As the so-called “credibility revolution” has swept over empirical economics
in the last generation, the role and status of experimental evidence have grown. Over
the same period, the field of experimental economics has segmented – List and
Rasul (2011) and Harrison and List (2004), for example, draw careful distinctions
between social experiments and artefactual, natural, and framed field experiments.
Briefly, social experiments tend to be conducted at a large scale and to focus on the
overall evaluation of policies or programs, often already in place. By contrast, the
various types of field experiments are typically smaller in scale and are more likely
to use artificial treatments (e.g., behavioral games) that would not correspond
directly to any specific policy but are designed primarily to uncover particular
behavioral tendencies or parameters.
Although all of the many varieties of experiments have been used to study
topics related to the labor market, this chapter focuses on large-scale social
experiments, which in our view have had the largest impact on policy.
The social experiment/field experiment distinction corresponds roughly to
the distinction drawn above between program evaluation and the identification of
structural parameters – social experiments are, at root, evaluations of programs or
policies, where field experiments are designed primarily to uncover one or more
specific structural parameters.4 As we discussed above, this distinction is less clear
than it once was – scholars are increasingly drawing on program evaluation samples
to understand structural relationships, and using structural parameters to inform
the design and interpretation of program evaluations. But while the distinction has
been blurred, it has not been obliterated, and nearly all of the social experiments
that we discuss in this chapter are designed, at least in part, to evaluate programs
that either have been or might plausibly be implemented in roughly the form used in
the experiment.
Another, related distinction has to do with the communities that conduct the
different types of experiments. Social experiments are typically conducted at a large
scale by an organization that specializes in this – historically, the “Big Three” players
(Greenberg and Shroder 2004) have been Mathematica, the Manpower
Demonstration Research Corporation (MDRC), and Abt Associates – and has been
hired by a government agency (most notably OPDR, the Office for Policy
Development and Research within the Department of Labor’s Employment and
Training Administration, and ASPE, the Assistant Secretary for Planning and
Evaluation within the Department of Health and Human Services) or a large
foundation (e.g., the Ford Foundation) for a specific study. By contrast, field
experiments are more often overseen by individual scholars and their students,
perhaps with the cooperation of a company or government agency that is not
otherwise closely involved in the design.
4 Kling et al. (forthcoming) refer to experiments aimed at understanding mechanisms rather than at evaluating programs as “mechanism experiments.” Gueron (this volume) discusses the tension between program evaluation and understanding mechanisms in early social experiments.
The differences in the composition and organizational structure of social
experimental and field experimental research teams relate to the scope of the work
being carried out. A research team implementing a social experiment faces a
number of practical and implementation challenges that are largely absent from
laboratory experiments and closely related types of field experiments. Researchers
rarely have access to a sampling frame corresponding to the population of interest;
face practical, ethical, and political difficulties in randomly assigning access to
treatment; have limited or no control over treatment alternatives that control
participants may obtain or over the specific implementation of the treatment, which
is often under the control of an agency rather than the experimenter; and lack ready
access to outcome measures for use in assessing the program’s impact (or even to a
well-defined set of outcomes of interest). Addressing these challenges often requires
a large staff to collect pre- and post-treatment data, to minimize attrition between
survey waves, and to monitor both the randomization of treatment and the fidelity
of treatment delivery to the program model. The required scale is often out of the
reach of individual researchers.
Most authors agree that the first large-scale social experiment in the labor
market was the New Jersey Income Maintenance Experiment (hereafter, IME; this is
also known as the New Jersey Negative Income Tax experiment), first initiated in
1968 and extended in various ways in other locations over the next several years.
Consistent with the above dichotomy, this was a large-scale experiment that was
initiated by the Office of Economic Opportunity (OEO), then an independent agency
within the Federal government that played a lead role in the War on Poverty. But in
other ways it more closely resembles what would now be called a field experiment,
albeit at a massive scale: It was first conceptualized by an individual researcher,
Heather Ross, who proposed it to OEO, and it was designed not to evaluate a
specific, well developed program but to map out the surface of labor supply
responses to a range of tax parameters and thereby to uncover semi-structural
economic parameters, the income and substitution effects of changes in tax rates.
Nearly all analyses of IME data went beyond simple treatment-control
contrasts, using the data to estimate parametric or semi-parametric labor supply
models.5 These models often incorporated corrections for the selection introduced
by nonparticipation that relied on strong functional form assumptions (e.g., Tobits)
and in some cases also rested on structural specifications of the response to
nonlinear tax schedules. In many of these studies, the treatment and control groups
were effectively pooled and it can be difficult to identify the extent to which the
parameters are identified from experimental vs. non-experimental variation.
Another sense in which the IME diverged from much modern social
experimental practice was in the source of outcome measures. The main outcome
measures for the IME analyses were payments under the IME and labor supply
measures drawn from participants’ self-reports as part of the program’s
administration. But as in other experiments, many subjects failed to complete the
follow-up surveys. Unfortunately, the design of the IME program meant that the
private returns to continued reporting varied dramatically with both treatment
status and endogenous outcomes, as the income maintenance payments were made
on the basis of these reports. Differential attrition made the results quite difficult to
interpret (Ashenfelter and Plant 1990).
5 Indeed, in 1990 – seven years after the final experimental report from the follow-up Seattle-Denver Income Maintenance Experiment, and after many published analyses of the data – Ashenfelter and Plant (1990) are apparently the first to report the results as simple means by randomly assigned treatment group.
In the wake of the Income Maintenance Experiments, the field exploded.
Greenberg, Shroder, and Onstott (1999; see also Greenberg and Shroder 2004)
identified 21 social experiments between 1962 and 1974, largely in education and
health. By contrast, they identify 52 between 1975 and 1982 and 70 between 1983
and 1996, and most of these are directly related to the labor market. (There has not
been as systematic a census of post-1996 experiments, but the pace of large scale
labor market experiments seems to have dropped off since then, at least in the
United States. There has been rapid growth of social experiments in education over
this period, however.) Greenberg et al. (1999; hereafter GSO) highlight important
changes in the post-1975 experiments. In contrast to the IME, most involved only
one or two treatment arms plus a control, and were designed more as “black box”
evaluations of the programs encapsulated in the treatments – often modifications on
existing programs (Gueron, this volume) – than as efforts to map out a response
surface.
GSO emphasize that the vast majority of the experiments they identified
focused on low-income populations, a fact that does not seem to have changed since
their survey. Several topics stand out as central:
- Human capital development. Over one-third of the studies in GSO’s sample
include at least one treatment arm involving a supported work
experience, on-the-job training, vocational education or training, or basic
education (including GED programs).
- Labor supply. A number of experiments have involved interventions
aimed at increasing labor supply, including the income maintenance
experiments, studies of re-employment bonuses for unemployment
insurance recipients, and a broad group of welfare-to-work experiments
conducted as part of the mid-1990s welfare reform movement.
- Job search assistance. Another common category of experiments examines
interventions aimed at making disadvantaged workers’ job search efforts
more effective, through counseling, job clubs, or job placement services.
These are not mutually exclusive. In particular, a number of programs and
experiments combined job search assistance with either job training or incentives to
find work.
b. Social experiments as a tool for program evaluation
Random assignment solves the selection problem that often plagues
non-experimental program evaluations, and makes it possible to generate uniquely
credible evidence on the effects of well-defined, successfully implemented
programs. In the absence of random assignment, people who participate in a
program (those who are “treated”) are likely to differ in observed and unobserved
ways from those who do not participate, and the effect of this selection can be
distinguished from the causal effect of the program only via the imposition of
unverifiable assumptions about the selection process. This is a very important
advantage of the experimental paradigm over other research methodologies
(so-called “observational” comparisons), and we do not intend to minimize its
contributions to the field of economics, public policy, and beyond.
But experiments have limitations as well – while they can have very high
internal validity, on closer inspection this is true only for certain types of programs
and certain types of outcomes; and even then there can be other challenges, such as
difficulties in generalizing from the experimental results to a broader setting.
In this subsection, we discuss the value of experiments as a means of solving
the selection problem. We then discuss some of the limitations of the experimental
paradigm for program evaluation and policy analysis. Our discussion draws heavily
on the Angrist-Imbens-Rubin (1996) “potential outcomes” framework. Some of the
limitations we discuss can be addressed via careful design of the experimental
study, while others require augmenting experimental methods with other tools. We
take up these topics in Section IV.
i. The benchmark case: Experiments with perfect compliance
The appeal of randomized experiments is that they make transparent the
assumptions that permit causal inference and create a direct link between the
implementation of the experiment and the key selection assumption. The simple
contrast between those randomly assigned to participate in the program and those
randomly excluded identifies the effect of being assigned to participate, subject only
to the assumption that the randomization was conducted correctly. Moreover, in
many cases this effect is identical to the effect of the program on its participants
(known as the “effect of the treatment on the treated,” or TOT), which is often the
main parameter of interest; in other cases, it is straightforward to convert the effect
of assignment to participate (often known as the “intention to treat,” or ITT, effect)
into an estimate of the program treatment effect for a subpopulation of interest.
These results are well known (see, e.g., Athey and Imbens, this volume), and
we do not review them at length here. But it will be useful to have notation later. We
use Donald Rubin’s potential outcomes framework for causal inference as set forth
in Holland (1986). We consider the evaluation of a simple, well-defined program,
such as an in-class job training course or a bonus scheme to encourage rapid return
to work after a job displacement, where it is possible to assign individuals
separately to participate or to be excluded from participation in the program.6 For
each individual $i$, one can imagine two possible outcomes: One that would obtain if $i$
participated in the program, $y_{i1}$, and one that would obtain if he or she did not
participate, $y_{i0}$.7 The program’s causal effect on person $i$ is simply the difference
between the outcome which would obtain if he/she participated and that which
would obtain if he/she did not, $\tau_i = y_{i1} - y_{i0}$. When $\tau_i > 0$, $i$ would have a higher outcome if
he/she participated than if he/she did not; when $\tau_i < 0$, the opposite is true.
6 In the case of the bonus scheme, the “treatment” is eligibility for the bonus, not actual receipt.
7 This notation rests on an assumption about the mechanisms by which the program operates, known as the “stable unit treatment value assumption,” or “SUTVA.” We discuss SUTVA at greater length below.
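Let $D_i$ be an indicator for participation, with $D_i = 1$ if $i$ actually participates in
the program and $D_i = 0$ if $i$ does not. The simplest estimator of the program’s effect is
the contrast between the average outcomes of those who participate and those who
do not. This can be written as:
$$
\begin{aligned}
E[y_i \mid D_i = 1] - E[y_i \mid D_i = 0] &= E[y_{i1} \mid D_i = 1] - E[y_{i0} \mid D_i = 0] \\
&= E[\tau_i \mid D_i = 1] + \left( E[y_{i0} \mid D_i = 1] - E[y_{i0} \mid D_i = 0] \right).
\end{aligned}
$$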
Thus, the simple participant-nonparticipant contrast combines two distinct
components: The effect of the treatment on the treated, $\tau_{TOT} = E[\tau_i \mid D_i = 1]$, and a
selection term, $E[y_{i0} \mid D_i = 1] - E[y_{i0} \mid D_i = 0]$, that captures the difference in outcomes
that would have been observed between those who participated in the program and
those who did not, had neither group participated (for example, had the program
not existed). This second term arises because the process by which people select (or
are selected) into program participation may generate differences between
participants and non-participants other than their participation statuses. If so, the
treatment-control difference cannot be interpreted as an estimate of the effect of the
program.
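The decomposition of the participant-nonparticipant contrast into a TOT term and a selection term can be checked numerically. The following is a minimal sketch under assumed conditions, not an analysis from the chapter: the simulated population, the enrollment rule, and the function names `simulate_self_selection` and `decompose` are all hypothetical. People with high untreated outcomes are assumed to be more likely to enroll, and because this is a simulation both potential outcomes are visible for everyone.

```python
import random
from statistics import mean


def simulate_self_selection(n=20000, seed=0):
    """Hypothetical population in which enrollment depends on the untreated outcome."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        y0 = rng.gauss(10.0, 2.0)            # outcome without the program
        tau = 1.0 + rng.gauss(0.0, 0.5)      # individual causal effect
        # Assumed enrollment rule: high-y0 people are more likely to participate.
        d = 1 if y0 + rng.gauss(0.0, 1.0) > 10.0 else 0
        people.append((y0, y0 + tau, d))     # (y_i0, y_i1, D_i)
    return people


def decompose(people):
    """Return the naive contrast, the TOT, and the selection term."""
    participants = [(y0, y1) for y0, y1, d in people if d == 1]
    nonparticipants = [(y0, y1) for y0, y1, d in people if d == 0]
    # Naive contrast uses only observed outcomes: y1 for D=1, y0 for D=0.
    naive = (mean(y1 for y0, y1 in participants)
             - mean(y0 for y0, y1 in nonparticipants))
    tot = mean(y1 - y0 for y0, y1 in participants)
    selection = (mean(y0 for y0, y1 in participants)
                 - mean(y0 for y0, y1 in nonparticipants))
    return naive, tot, selection


naive, tot, selection = decompose(simulate_self_selection())
print(f"naive contrast: {naive:.3f}")
print(f"TOT:            {tot:.3f}")
print(f"selection term: {selection:.3f}")
```

In this setup the naive contrast equals the TOT plus the selection term up to floating-point rounding, and the selection term is large relative to the TOT, so the naive contrast substantially overstates the program's effect.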
In a simple social experiment, $D_i$ is randomly assigned. This ensures that the
distributions of $y_{i0}$ and $\tau_i$ are each the same for those with $D_i = 0$ as for those with $D_i = 1$.
The first implies that the selection term is zero; the second, that the TOT effect
equals the average treatment effect (ATE), $E[\tau_i]$, in the population represented by
the study sample. Thus, the average causal effect is identified, not just in the treated
subgroup but in the larger population.8
This, in a nutshell, is the value of randomization in program evaluation. In a
simple randomized control trial, the identification assumption that justifies causal
inference is simply that the randomization was correctly executed. Of course, in any
finite sample there may be differences in the sample averages of $y_{i0}$ or $\tau_i$ between
treatment and control groups. But this variation is captured by the standard error of
the experimental estimate. The estimate is unbiased, with measurable uncertainty,
so long as the groups are the same in expectation.
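The role of the standard error can be seen in a small simulated experiment. This is a minimal sketch with a hypothetical data-generating process and a hypothetical function name, `randomized_experiment`. Assignment is a fair coin flip that is independent of the potential outcomes, and the standard error is the conventional one for a difference in means.

```python
import random
from statistics import mean, variance


def randomized_experiment(n=20000, seed=1):
    """Simulate a two-arm experiment with perfect compliance (illustrative only)."""
    rng = random.Random(seed)
    treated, control, effects = [], [], []
    for _ in range(n):
        y0 = rng.gauss(10.0, 2.0)          # untreated potential outcome
        tau = 1.0 + rng.gauss(0.0, 0.5)    # individual causal effect
        effects.append(tau)
        # D_i is set by a coin flip, independent of (y0, tau).
        if rng.random() < 0.5:
            treated.append(y0 + tau)       # observe y_i1
        else:
            control.append(y0)             # observe y_i0
    estimate = mean(treated) - mean(control)
    # Conventional standard error for a difference in two sample means.
    std_error = (variance(treated) / len(treated)
                 + variance(control) / len(control)) ** 0.5
    sample_ate = mean(effects)             # knowable only inside a simulation
    return estimate, std_error, sample_ate


estimate, std_error, sample_ate = randomized_experiment()
print(f"estimate:   {estimate:.3f} (s.e. {std_error:.3f})")
print(f"sample ATE: {sample_ate:.3f}")
```

The estimate need not equal the sample ATE exactly in any one draw, but under these assumptions the gap between them is on the order of the reported standard error.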
ii. Imperfect compliance and the local average treatment effect
A complication that often arises, and that will be central to some of our
discussion below, is that it is not always possible to control subjects’ program
participation. Some subjects who are assigned to receive job training may not show
up to their course, while others who are assigned to the control group, and thus not
to receive training, may find another way into the program. This can be formalized
by introducing an additional variable, $Z_i$, representing the experimenter’s intention
for individual $i$: An individual with $Z_i = 1$ is intended to be served, and one with $Z_i = 0$
is not to be. $Z_i$ is related to $D_i$, but imperfectly: Some (non-randomly selected)
individuals who are assigned $Z_i = 1$ will wind up with $D_i = 0$ (e.g., those who fail to
arrive for their assigned training course), and others who are assigned $Z_i = 0$ will
wind up with $D_i = 1$ (e.g., those who talk their way past the program screener).
8 This holds if the entire population of interest is part of the experiment. If the study sample is not representative of the broader population, the ATE identified will be local to the subpopulation represented by the sample.
With partial compliance, the experiment identifies neither the average
treatment effect (ATE) nor the average effect of the treatment on the treated (TOT).
Rather, the best that can be identified is the local average treatment effect, or LATE,
for the subgroup of experimental subjects who comply with their experimental
assignment. Specifically, let $D_{i0}$ represent the individual’s treatment status if
assigned $Z_i = 0$ and $D_{i1}$ represent the treatment status if assigned $Z_i = 1$. The
“complier” subpopulation is defined as those with $D_{i0} = 0$ and $D_{i1} = 1$ – those who
receive the treatment if and only if they are assigned to receive it. The contrast
between the average outcomes of those assigned to receive and not to receive
treatment is then:9
$$
E[y_i \mid Z_i = 1] - E[y_i \mid Z_i = 0] = \Pr\{D_{i0} = 0, D_{i1} = 1\} \cdot E[\tau_i \mid D_{i0} = 0, D_{i1} = 1].
$$
This is known as the “intention to treat” (ITT) effect. The first term is the complier
share of the experimental population; the second is the local average treatment
effect (LATE) for compliers.
In many cases, the ITT is the effect of primary interest. It represents the actual effect of offering access to the program in the setting in which the experiment takes place. Often, it is only possible to manipulate the option to participate (consider, for example, the offer of job training – one can never force individuals to participate in a training program), so the effect of manipulating this offer is the key parameter for evaluation of the programs under consideration.
9 We assume here, as in nearly all analyses of experiments with partial compliance, that there are no “defiers” who receive the treatment if and only if they are assigned not to receive it (D_{i0} = 1 and D_{i1} = 0).
In other cases, however, one might want to identify the effect of program participation (as distinct from the offer to participate). One can recover the LATE for compliers by dividing the ITT by the complier share, which can be identified as E[D_i | Z_i = 1] - E[D_i | Z_i = 0]; equivalently, the LATE can be recovered from an instrumental variables regression using Z_i as an instrument for D_i.
The LATE may differ from the ATE or even from the TOT. For example, in many settings one would expect people who will receive the largest benefits from treatment to make disproportionate efforts to obtain it, even if assigned to the control group; in this case, the TOT will exceed the LATE. Unfortunately, the compliers are not always the population of primary interest. Further structure, or successful randomization of D_i itself, is required to identify the ATE or TOT.
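The relationship between the ITT, the complier share, and the LATE can be illustrated with a small simulation. The sketch below is ours and is not drawn from any study discussed here: the type shares (60% compliers, 10% always-takers, 30% never-takers), the effect sizes, and the function names are hypothetical choices made only to show the arithmetic of dividing the ITT by the complier share.

```python
import random


def simulate_experiment(n=200_000, seed=0):
    """Simulate (Z_i, D_i, y_i) under partial compliance with no defiers.

    All shares and effect sizes are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    # Treatment effect tau_i by compliance type (hypothetical values).
    tau = {"complier": 2.0, "always_taker": 5.0, "never_taker": 0.5}
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5  # randomized assignment Z_i
        u = rng.random()
        if u < 0.6:
            kind = "complier"
        elif u < 0.7:
            kind = "always_taker"
        else:
            kind = "never_taker"
        # Always-takers are treated in both arms; compliers only if assigned.
        d = kind == "always_taker" or (kind == "complier" and z)
        y0 = rng.gauss(10.0, 1.0)  # untreated outcome y_0i
        y = y0 + (tau[kind] if d else 0.0)
        rows.append((z, d, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def wald_estimates(rows):
    """Return (ITT, complier share, LATE) from assignment-group means."""
    itt = mean(y for z, d, y in rows if z) - mean(y for z, d, y in rows if not z)
    share = mean(d for z, d, y in rows if z) - mean(d for z, d, y in rows if not z)
    return itt, share, itt / share


itt, share, late = wald_estimates(simulate_experiment())
print(f"ITT = {itt:.3f}, complier share = {share:.3f}, LATE = {late:.3f}")
```

Under these assumed values the ratio should be close to the compliers’ effect of 2.0. The effect on the treated is larger, because always-takers (who gain 5.0 in this setup) are among the treated, which corresponds to the case in which the TOT exceeds the LATE.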
c. Limitations of the experimental paradigm
The basic experimental paradigm is invaluable for its ability to resolve the fundamental problem of causal inference, by ensuring that estimated program effects are not confounded by selection into treatment. But it cannot solve all identification problems faced by program evaluators, nor answer all questions posed by labor economists seeking to understand the workings of the labor market. In the remainder of this section, we will briefly introduce six (partially overlapping) design issues that commonly arise in labor market experiments. In each case, identifying the effects of interest may require moving beyond the treatment-control contrast in outcomes from a simple randomized experiment. We discuss each in more detail in Section IV, where we also discuss potential solutions to each.
i. Spillover Effects and the Stable Unit Treatment Value Assumption
The above brief overview of the econometrics of experiments glosses over an important assumption, known as the “stable unit treatment value assumption,” or SUTVA (Angrist, Imbens, and Rubin 1996; Athey and Imbens, this volume). Intuitively, this assumption states that the outcome of individual i is unaffected by the treatment status of each of the other study participants. Without this assumption, each individual has not two but 2^N potential outcomes, making analysis intractable. For many program evaluations, SUTVA is innocuous. But in other cases it can be quite restrictive. For example, the provision of job search assistance to some individuals may create “congestion” in the labor market, reducing the job-finding rates of others participating in that market. This is a violation of SUTVA, and will lead a simple randomized trial to overstate the total effect of job search assistance. Another potential violation of SUTVA occurs if members of the treatment group interact with each other or with the control group in a way that dilutes the treatment difference between them – for example, if the treatment involves information provision but treated individuals pass that information on to the controls.
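A stylized calculation shows how congestion can make the treatment-control contrast overstate the total effect. In this sketch, which is an illustrative assumption of ours rather than a model from the literature, a fixed number of jobs is allocated in proportion to search intensity, and job search assistance multiplies a treated individual’s intensity by a hypothetical factor `boost`.

```python
def job_finding_rates(n_treated, n_control, jobs, boost):
    """Job-finding rates when a fixed stock of jobs is rationed.

    Each control searcher has weight 1 and each treated searcher has weight
    `boost`; jobs are split in proportion to weights. Purely illustrative.
    """
    total_weight = boost * n_treated + n_control
    treated_rate = jobs * boost / total_weight
    control_rate = jobs / total_weight
    return treated_rate, control_rate


# Hypothetical market: 1,000 searchers, 300 jobs, assistance doubles intensity.
treated, control = job_finding_rates(500, 500, jobs=300, boost=2.0)
_, baseline = job_finding_rates(0, 1000, jobs=300, boost=2.0)

experimental_contrast = treated - control  # what the RCT measures
gain_for_treated = treated - baseline      # relative to a world with no program
loss_for_controls = control - baseline     # displacement of the control group
print(experimental_contrast, gain_for_treated, loss_for_controls)
```

With these assumed numbers the experimental contrast is about 0.2, but the treated gain only about 0.1 relative to a market without the program, the controls lose about 0.1, and total employment is unchanged.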
ii. Endogenously observed outcomes
In many labor market experiments, some outcomes of interest are observed only for a subset of individuals. For example, weekly hours of work (labor supply), hourly wages, job characteristics, career advancement, and retention on the job are observed only for those who are able to find jobs, not for those who are unemployed. Even ideal experiments with perfect compliance may not identify the causal effects of interest on these outcomes.
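A minimal numerical sketch of the problem, under hypothetical assumptions: suppose the treatment has no effect on anyone’s wage but raises employment among low-wage workers. The mean wage among the employed then differs between arms purely because the composition of the employed differs. The population and the function name below are invented for illustration, and the same population stands in for both arms because randomization makes the arms identical in expectation.

```python
def mean_wage_among_employed(people, treated):
    """Average observed wage among those employed under the given arm.

    Each person is (wage, employed_if_control, employed_if_treated).
    """
    wages = [w for w, e0, e1 in people if (e1 if treated else e0)]
    return sum(wages) / len(wages)


# Hypothetical population: wages do not depend on treatment at all.
high_wage = [(20.0, True, True)] * 50   # employed in either arm
low_wage = [(10.0, False, True)] * 50   # employed only if treated
population = high_wage + low_wage

control_mean = mean_wage_among_employed(population, treated=False)
treated_mean = mean_wage_among_employed(population, treated=True)
print(control_mean, treated_mean, treated_mean - control_mean)
```

The observed wage contrast is negative even though no individual’s wage changed, because the treatment arm’s employed group includes low-wage workers who would not have been employed as controls.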
iii. Site and Group Effects
Another large class of limitations in experiments has to do with generalizing beyond the experimental sample. Extrapolations to other programs, other samples, or other treatment regimes can be hazardous. We will discuss in this paper three broad classes of external validity issues.
One class has to do with variations in the treatment on offer across program locations. In many programs, the treatment is not homogeneous across locations; in other cases, the treatment may be homogeneous but outcome distributions vary. In either case, one might be interested in identifying how treatment effects vary across locations.
The second class derives from observed differences between the population of interest and that included in the experimental sample – one might want to understand a program’s effect on a population that differs in observable ways from that represented in the experimental sample, or on a subpopulation other than the experimental compliers.
iv. Treatment Effect Heterogeneity and External Validity
The third class of external validity issues arises from unobserved differences in individual treatment effects – when the effect of the treatment varies across individuals in ways that are not captured by observed participant characteristics, and when the parameters of interest extend beyond the average treatment effect in the population from which the experimental sample is drawn. This can occur when, for example, the experimental complier share is not expected to match the take-up rate when the program is offered more generally, or when one expects to offer the program to a population that may differ in its treatment effect distribution from the experimental population. While conceptually similar to differences along observed characteristics, the econometrics behind addressing unobserved differences in treatment effects is sufficiently complex and self-contained that we discuss it separately.
v. Hidden Treatments
Interpreting estimated program effects and extrapolating to other settings can be complex even in the case of uniform treatments and uniform populations. For example, if non-compliers have access to alternatives to the program under study (e.g., to courses offered by alternative job training providers), this will lead to variation in treatment effects even without treatment effect heterogeneity or non-compliance in treatment assignment in the standard sense. The alternative treatments are often “hidden,” as administrative data on the program under study will not reveal whether participants have received alternatives elsewhere. In this case, the experimental impact identifies the treatment’s effect relative to a poorly specified alternative that may not differ dramatically, and may be a poor guide to the program’s value relative to no treatment. In multi-site studies, differential take-up of such hidden treatments by the control group may create the appearance of treatment effect heterogeneity across sites and hinder extrapolation to other settings.
vi. Mechanisms and Multiple Treatments
In many instances, we are interested in understanding the mechanism generating a particular treatment effect. In some cases, the effects of separate mechanisms are of inherent interest. In complex experiments with multiple treatments, it is important to understand which treatments were particularly effective, and why. For example, many job training programs include job search assistance, and vice versa. In other cases, understanding the mechanisms is crucial in extrapolating from the particular experimental setting to other situations. For example, in the Canadian Self-Sufficiency Program (SSP) workers have to first establish eligibility to then participate in a wage subsidy program, creating endogenous selection that makes it difficult to interpret how the subsidy program affects labor supply (Card and Hyslop 2005). Without additional information or additional structure, multiple mechanisms are not separately identified, leading to potentially serious limitations in understanding of the program and in external validity.
d. Quasi-experimental and Structural Research Designs
It is not always possible to use a true randomized experiment to evaluate a program or mechanism of interest, due to operational, financial, or ethical constraints. Quasi-experimental studies rely on aspects of the program or policy variation as a source of plausibly as-good-as-random variation in treatment assignment – examples include regression discontinuity designs, regression kink designs, and difference-in-differences (see Angrist and Krueger 1999). These can be useful alternatives when true experiments are infeasible or simply not available. When the quasi-experimental variation is as good as randomly assigned, the various quasi-experimental designs can recover treatment effects just as can experiments.
But even if the assumptions governing assignment are correct, quasi-experimental designs generally solve only the assignment problem, and do not necessarily address the additional issues discussed above. The same is true for selection-on-observables estimators (e.g., matching estimators): The “unconfoundedness” assumption eliminates the selection problem, if it holds, but does nothing to address other design issues.
In contrast, structural approaches that explicitly specify all aspects of the choice problem and resulting outcomes can in principle resolve both assignment and other design issues simultaneously. However, this approach hinges on the model being correctly specified, and hence may come at a substantial cost to internal validity.
III. A more thorough overview of labor market social experiments
It is no accident that we discuss design issues of RCTs in the context of social experiments in the labor market, since many of the major design issues discussed in Section II arise in the evaluation of important labor market programs. In this section we review some of the main characteristics of existing social experiments in labor economics in light of these design issues. We distinguish three broad substantive topics that have been studied extensively via social experiments: Labor supply, particularly of low-income families, welfare recipients, and unemployment insurance recipients; job training and skill development; and job search. In this Section, we discuss each in turn. For a more detailed discussion of the experiments we mention here, we refer the reader to our summary tables, and to excellent overviews provided elsewhere.10
a. Labor Supply Experiments
One can broadly categorize social experiments providing incentives to increase labor supply into three groups, following their program structure, target group, and time period: The Income Maintenance Experiments in the late 1960s and early 1970s; welfare reform experiments in the late 1980s through the mid-1990s; and reemployment subsidy experiments, which span a longer time period.
The Income Maintenance Experiments
A first wave of experiments comprised the Income Maintenance Experiments (IME) already discussed in Section II, which treated low-income households with various combinations of lump-sum transfers and taxes on earnings. By randomly assigning treatment and control groups to multiple treatment arms with varying combinations of tax rates and subsidies, and by separately targeting groups of different income levels, the experiments allowed tracing out labor supply responses in different parts of the budget constraint and under varying financial conditions. There were four such experiments, initiated between 1968 and 1971, in New Jersey, Seattle-Denver, Gary (IN), and in rural areas. Table 1 provides detailed information about these experiments. While the sample sizes were moderate by later standards, the total cost was substantial compared to most randomized evaluations of labor supply incentives that would follow. This is in important part because the program – the payments themselves – was expensive on a per-participant basis. Complex, stratified experimental designs were used in efforts to minimize these costs, but even with these the studies were major investments.
10 See among others Greenberg and Shroder (2004), Heckman, Lalonde, and Smith (1999), Meyer (1995). Our overview focuses almost exclusively on U.S. experiments. For an overview of active labor market policy evaluations, drawing largely on European evidence, see Card, Kluve, and Weber (2010).
Across each of the income maintenance studies and various comparison groups (e.g., husbands, wives, and single female household heads), labor supply results were fairly consistent: The combination of a lump-sum transfer and a positive tax rate reduced participants’ earnings (i.e., labor supply), and more so when the transfer and tax rate were larger. This reflects a combination of income and substitution effects; Robins (1985) combines the various studies and uses contrasts among the different treatment arms to separately identify the income and substitution elasticities of labor supply. He concludes that these elasticities were fairly stable across studies, but fairly small: The substitution elasticity was under 0.1 for husbands, just above 0.1 for single female heads, and more variable but averaging 0.17 for wives. Income elasticities were less consistent, but centered around -0.1.
In retrospect, these experiments encountered a number of the design issues that we identified in Section II and discuss at greater length below. For example, because of the high attrition rates, which as Ashenfelter and Plant (1990) note were differential across treatment groups, they also can be seen as an example of the endogenously observed outcomes problem. Similarly, without additional assumptions it is impossible to estimate the effect of these programs on hours worked or wages. Interestingly, in contrast to most randomized evaluations that followed, they were primarily focused on identifying the mechanisms – income vs. substitution effects – behind any labor supply responses, rather than the simple treatment effect of an existing program. This motivated the use of a large number of treatment arms, an option we discuss below as one way of addressing questions about mechanisms.
Welfare Reform Experiments
A second wave of social experiments related to labor supply was initiated between the late-1980s and the mid-1990s, and evaluated the effect of employment incentives for welfare recipients. While the IME experiments were funded almost exclusively by the federal government, these later evaluations concerned state-level programs and were funded mostly at the state level.11 In contrast to the relatively straightforward structure of the negative income tax treatments, these were usually randomized evaluations of entire, complex programs, often designed as replacements for traditional AFDC, that included components designed to strengthen work incentives along with others (e.g., child care or job search assistance) designed to reduce barriers to work.
11 For a detailed historical account, see the chapter by Judith Gueron in this volume.
We have identified welfare RCTs in at least 13 states. Table 1 includes a selection of four social experiments on this topic, implemented in California, Connecticut, Florida, and Minnesota, though there were many more not listed here. A common component to most new programs (experimental treatments) was the introduction of lifetime time-limits of welfare receipt and increases in earnings disregards, both eventual components of the 1996 federal welfare reform – prior to this reform, implementation of such changes required a waiver from the U.S. Department of Health and Human Services, and this was often conditioned on an experimental evaluation. The exact nature of both the new programs and the traditional welfare benefit varied by state. Other program features varied widely as well, including job search assistance, access to child care, changes in case management, and provision of job training.
Two examples to which we will refer later are Connecticut’s Jobs First and Florida’s Family Transition Program. In both cases, control group members faced a welfare benefit schedule that had no time limits and high implicit taxes on working.12 Jobs First and the Family Transition Program each introduced time limits for welfare receipt and benefit schedules with lower implicit tax rates. Under Jobs First, eligible welfare recipients saw no reduction in their benefits while working until earnings hit the federal poverty line. Under the Family Transition Program, a working welfare recipient could keep $200 a month, plus 50% of all earnings above $200. Both programs also modified other welfare program features, including enhanced enforcement of work requirements, changing the duration of access to Medicaid benefits, setting asset limits for welfare receipt, and providing child care assistance, among others.
12 In Connecticut, welfare recipients were eligible for a fixed earnings disregard of $120 for the twelve months following the first month of employment while on assistance and $90 afterwards. Recipients were also eligible for a proportional disregard of earnings above $120 ($90): 51% for the four months following the first month of employment and 27% afterwards. In Florida, after the first four months of work, the marginal tax rate on earnings for AFDC recipients was 100% if they earned over $90 per month.
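The Family Transition Program rule described above implies a simple piecewise schedule, sketched here. We read “keep” as earnings that do not reduce the welfare benefit; that reading, the function names, and the omission of time limits and other program rules are our assumptions.

```python
def ftp_earnings_kept(earnings, flat_amount=200.0, share_above=0.5):
    """Monthly earnings kept under the rule as described in the text:
    the first $200 plus 50% of earnings above $200."""
    return min(earnings, flat_amount) + share_above * max(earnings - flat_amount, 0.0)


def implicit_tax_rate(earnings, step=1.0):
    """Share of one extra dollar of earnings that is not kept."""
    kept_now = ftp_earnings_kept(earnings)
    kept_next = ftp_earnings_kept(earnings + step)
    return 1.0 - (kept_next - kept_now) / step


for monthly_earnings in (100.0, 200.0, 600.0):
    print(monthly_earnings,
          ftp_earnings_kept(monthly_earnings),
          implicit_tax_rate(monthly_earnings))
```

Under this reading the implicit tax rate is zero on the first $200 of monthly earnings and 50% above it, compared with the 100% marginal rate that footnote 12 reports for Florida AFDC recipients earning over $90 per month after the first four months of work.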
The randomized evaluation of the two programs captured the combined effects of all of these changes on employment and earnings. Each program led to higher earnings and higher total incomes, inclusive of welfare payments, in the treatment group, though in each case this effect diminished over time. Total governmental costs were higher for the Connecticut treatment group than for controls, but the reverse was true in Florida. An important caveat is that these results largely reflect the period before time limits bound.
In many of the welfare-to-work experiments, key outcomes of interest included hours of work among those who are employed and wages or earnings. Neither of these is observed for those who are not employed. Thus, although many studies report experimental effects on endogenously observed outcomes, these are understood to suffer from serious selection problems. Another issue to take into account in interpreting these experiments is the possibility of spillover effects. These were typically not small pilot studies but involved broad changes to welfare rules, sometimes applied to all program participants except for a hold-out control group.
Another major question regarding welfare-to-work programs concerns heterogeneity in treatment effects. One might imagine that there is a subpopulation of recipients who are responsive to work incentives and another group of hard cases who are much less responsive. The average treatment effects that can be estimated from these experiments might substantially overstate the employability of the latter participants.
Reemployment Subsidy Experiments
A third broad group of labor supply-related experiments evaluated direct reemployment subsidies. One set of such programs had incentives structured like a negative income tax and were targeted to welfare recipients or low-income individuals, sometimes as part of the same AFDC reforms discussed above. These took place mostly in the mid- to late-1990s, and included the Canadian Self-Sufficiency Program (SSP), Minnesota’s Family Investment Program (FIP), and Wisconsin’s New Hope Project. These RCTs can be seen as evaluations of welfare-like programs, but included subsidies that were conditional on sustaining a certain amount of employment. Not surprisingly, these programs generally led to increased earnings among treatment group participants (though FIP was an exception); different studies varied in whether the additional income of participants was larger or smaller than the extra welfare costs borne by the government.
Another set of such programs were schemes that paid lump-sum subsidies conditional on employment – effectively, bonuses for finding work. These include the well-known reemployment bonus experiments targeted at unemployed workers receiving unemployment insurance in Illinois, Pennsylvania, and Washington State in the mid-1980s. These studies found that eligibility for a relatively large reemployment bonus led to shorter unemployment insurance spells, with no detectable impact on the quality of the job obtained, but that the effects were relatively small and thus the programs were not cost effective.
More recently, a bonus for welfare recipients who found a job and who remained reemployed for a certain time was evaluated in the context of Texas’ Employment Retention and Advancement (ERA) project in the early 2000s (Dorsett et al. 2013). The Texas evaluation was part of a large-scale randomized evaluation of 12 different service combinations in different U.S. cities from 2000 to 2004 under the ERA project umbrella (Hamilton and Scrivener 2012). The main focus of ERA was to expand workforce services to recently reemployed welfare recipients or low-wage workers to maintain successful labor force attachment (though three sites, including Texas, combined pre- and post-employment assistance). The evaluation tested a broad range of services, with at best mixed results regarding the effect of post-employment services tested.
An important feature of several of these employment subsidy programs was that potential recipients had to become eligible for the subsidy, usually by working a minimum number of hours. Hence, while the main goal of the programs was to help workers build attachment to the labor force, effects of the subsidy (as distinct from the subsidy offer) on the duration of employment could be estimated only for those who found jobs in the first place, a subsample that was differentially selected in the treatment and control groups. Card and Hyslop (2005) refer to this as an ‘eligibility effect’; in our earlier taxonomy of design challenges, this can be seen as a case where the mechanisms underlying the treatment effect are of primary interest. Under any name, it complicates the interpretation of the outcomes of a simple RCT.
Overall, randomized studies of a range of labor supply incentive programs have found labor supply responses to changes in implicit or explicit financial incentives as predicted by theory. However, a broad theme emerges that employment effects have mostly been short-lived, and effects on total participant income inconsistent. A challenge in interpreting these studies has been that typically a number of treatments were varied simultaneously, including implicit tax rates and lump-sum transfers, training programs, job search assistance, enforcement and/or time limits. Hence, extrapolating from these findings to new programs providing different combinations of treatments is difficult without understanding the underlying behavioral responses, which typically requires additional assumptions.
b. Training experiments
From 1964 to today, we count over 50 RCTs that evaluate job training programs of various forms. These include large-scale evaluations conducted at the national level, state-level evaluations, and evaluations of programs at the local level. The programs evaluated varied substantially in the type of training, which ranged from vocational and general classroom-based training of different durations to on-the-job training by actual employers. Most training programs were complemented by some kind of job search assistance, but in the studies we review here this was not the emphasis. Table 2 provides an overview of a selected group of these RCTs.
Training programs are less easily classified than labor supply programs. While the first job training social experiment of which we are aware focused on laid-off workers (the General Education in Manpower Training experiment, begun in 1964), the vast majority of training programs are targeted to welfare recipients, to low-income individuals generally, or to low-income youth. Moreover, while one can broadly distinguish phases of experimental evaluation parallel to the patterns in the evaluation of welfare programs outlined above, randomized evaluations of training programs occurred more evenly from the 1980s to today. It is also harder to discern common patterns in the types of training provided or programs evaluated.
The first large-scale evaluation of a mix of on-the-job experience and supervision for hard-to-employ individuals was the National Supported Work Demonstration (NSWD), which ran from 1975 to 1980. The NSWD was a large and expensive social experiment implemented by the U.S. at the national level, but did not evaluate an established training program. Rather, the NSWD relied on local non-profits to organize a program in which treatment participants were placed in teams of up to 10 participants working under a foreman, who also served as a counselor and later provided job search assistance, on small-scale projects, typically in construction, light manufacturing, or social service provision. Participants received as much as one year of work experience, under conditions of increasing demands, close supervision, and work in association with a crew of peers. The study targeted four groups of workers: women who had been on AFDC for at least 30 months; ex-addicts; ex-offenders; and young high-school dropouts. It took place at 10 sites, and at each site enrollees were selected randomly from a group of volunteers.13 Participation had large positive effects on AFDC recipients and smaller positive effects on ex-addicts, but benefits for other groups were smaller and generally statistically insignificant.
The data used to evaluate NSWD came from a series of follow-up surveys.14 Attrition was an issue here: After 27 months, only 72% (68%) of the treatment (control) groups of the NSWD completed interviews. As in the negative income tax (NIT) studies, this can be seen as a variant of the endogenously observed outcomes problem.
The NSWD study was followed by a range of evaluations of state-level programs in the early- to mid-1980s. These were targeted almost exclusively at welfare recipients, and largely financed by the federal government. These evaluations continued, with greater involvement of state governments, through the late 1980s and mid-1990s. While many of these RCTs were relatively small, some were substantial. Examples include the California GAIN and Ohio JOBS program evaluations, beginning in 1988 and 1989, respectively. Detailed characteristics of some of these evaluations are shown in Table 2. The California program, which was mandatory for welfare recipients, included job search assistance, basic education, and skills training. It had large positive effects on earnings and negative effects on welfare receipt, particularly for single parents. Effects were largest in Riverside County, where administrators emphasized job placement as the central goal. However, a reanalysis of the long-term effects of GAIN by Hotz et al. (2006) found that the effects in Riverside County were short-lived relative to those in Los Angeles County, which focused more on human capital development and where effects were initially smaller but rose over time.15 The Ohio program was similar in design but encountered more problems in implementation, and yielded smaller effects.
13 The Manpower Demonstration Research Corporation (MDRC) was founded in 1974 to manage the NSWD study. For a detailed summary of the program and findings, see Manpower Demonstration Research Corporation Board of Directors (1980).
14 The NSWD has been examined by an extensive literature, including Lalonde (1986), Dehejia and Wahba (2002), and Smith and Todd (2005).
An exception to the trend towards evaluation of state-level or local training programs was the large-scale, national evaluation of the main federal training program aimed at low-income adults and disadvantaged youth – the National Job Training Partnership Act (JTPA) Study. The JTPA was a federal program enacted in 1982, and was administered at the state and local level. JTPA training programs provided employment training for specific occupations and services, such as job search assistance and remedial education, to roughly one million economically disadvantaged individuals per year. While the program and some services were administered directly by JTPA staff, training was provided through local service providers, such as vocational-technical high schools, community colleges, proprietary schools, and community-based organizations. Training lasted three to four months, on average, but duration varied widely across individuals and program sites.
Congress, in part responding to limitations of non-experimental evaluations of the predecessor program to JTPA, the Comprehensive Employment and Training Act, mandated a randomized evaluation of JTPA in 1986. Control subjects were excluded from obtaining JTPA services for 18 months. To assess short- and medium-term program impacts on employment and earnings, the evaluation both collected survey data and drew from administrative state-level records.16 The evaluation took place at 16 JTPA program sites (so-called Service Delivery Areas, SDAs). Participation by SDAs in the evaluation was voluntary, and some SDAs objected to randomly excluding eligible applicants. The participating SDAs did not differ from others in observable characteristics (e.g., Bloom et al. 1997), but may have differed in unobserved ways that would be relevant to an extrapolation to the overall effect of the national program.
15 Hotz et al. (2006) also point out that the treatment group was selected differently between the four GAIN sites, possibly contributing to the estimated ‘site’ effects. For example, the Riverside County RCT sample included a smaller fraction of the more disadvantaged welfare recipients.
An explicit goal of the JTPA evaluation was to obtain differential impacts for a wide range of target groups, including adult women, adult men, female youths, and male youth with and without an arrest record. Adult women saw the largest earnings gains, followed by adult men; effects on youth were smaller and generally not significant (though there were significant effects on attainment of high school diplomas for both adult women and female youth). In addition to demographic subgroup analyses, heterogeneity in program impacts was estimated along several other dimensions, including JTPA services recommended by program intake staff, ethnicity and prior labor market experience. While the subgroup effects of interest were largely pre-specified, this does not fully eliminate multiple-comparisons problems, particularly when the number of pre-specified comparisons is so large, and thus there is an enhanced risk of a false positive.
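The scale of the multiple-comparisons risk can be seen in a back-of-the-envelope calculation. The sketch below assumes, for illustration only, that the subgroup tests are independent and that every null hypothesis is true; neither assumption is claimed for the JTPA evaluation, and the function names are ours.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across independent tests
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


for n_tests in (1, 5, 20):
    print(n_tests,
          round(familywise_error_rate(0.05, n_tests), 3),
          bonferroni_threshold(0.05, n_tests))
```

With 20 independent tests at the 5% level, the chance of at least one spurious “significant” subgroup effect is roughly 64%, which is why pre-specification alone does not remove the problem.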
16 See Bell et al. (1994) and Bloom et al. (1997) for descriptions of the JTPA evaluation. There is a substantial literature on the evaluation of the JTPA program. See Heckman, Lalonde, and Smith (1999) for a summary.
Job training evaluations slowed after welfare reform in the mid-1990s, then began to pick up again in the early 2000s. Some evaluations in this period focused on sector-specific employment, such as the Sectoral Employment Impact Study (e.g., Maguire et al. 2010) and evaluations of similar smaller, local programs.17 There was also a randomized evaluation of combined training and job placement services under the Workforce Investment Act (WIA) from 2005 to 2015 (the Work Advancement and Support Center Demonstration), and more recently a study of the return from community college attendance under the Trade Adjustment Assistance Community College and Career Training (TAACCCT) Grants Program.
A distinct broad strand of randomized evaluations of training programs focuses on low-income youths. Again, these programs offer a broad range of different types of training augmented by varying combinations of support services. Social experiments in this area have included a range of federally and nationally funded evaluations ranging from the early 1980s to the mid-1990s that culminated in the National Job Corps Study, described below. As in other job training studies, the pace of experimentation slowed in the mid-1990s, but several new studies were undertaken in the mid-2000s. Some randomized evaluations, such as New York City’s Summer Youth Employment Program (strictly, a natural experiment, as randomization is part of the rationing process and not a decision made in order to facilitate an evaluation), are ongoing. Again, the broad trend was from a federal monopoly on funding towards a greater involvement of local and private funding sources.
17 These include, among others, the Georgia Works programs, Project Quest in San Antonio, the Wisconsin Regional Training Partnership in Milwaukee, Per Scholas in New York City, and the Jewish Vocational Service in Boston.
The largest and perhaps best known study of a training program for disadvantaged youths is the National Job Corps Study. The Job Corps was created in 1964 as part of the War on Poverty, and currently operates under the provisions of the Workforce Innovation and Opportunity Act of 2014, which consolidated programs authorized under the Workforce Investment Act of 1998. Job Corps services are geared towards economically disadvantaged youths aged 16 to 24. Core services are delivered by a Job Corps center, usually residential, and include vocational training, academic education, residential living, health care, and a wide range of other services, including counseling, social skills training, health education, and recreation.18 About a quarter of the over 100 centers are operated directly by the U.S. government, with the remainder operated by private contractors. The average duration of the program is eight months, though by its philosophy the duration responds to the participant’s needs and actual duration varies widely. For six months after the youths leave the program, placement agencies help participants find jobs or pursue additional training.
The Job Corps evaluation was based on an experimental design in which, with a few exceptions, all youths nationwide who applied to Job Corps in the 48 contiguous states between November 1994 and December 1996 and were found to be eligible were randomly assigned to either a program group or a control group. Program group members were allowed to enroll in Job Corps; control group members were excluded for three years after random assignment. The comparisons of program and control group outcomes represent the effects of Job Corps relative to other available programs that the study population would enroll in if Job Corps were not an option.19 The control and treatment groups were tracked with a series of interviews immediately after randomization and continuing 12, 30, and 48 months after randomization.
18 The majority of training is vocational, and curricula were developed with input from business and labor organizations and emphasize the achievement of specific competencies necessary to work in a trade. Academic education aims to alleviate deficits in reading, math, and writing skills and to provide a GED certificate. Although most Job Corps services are residential, there have been nonresidential participants (mostly women with children). There have been efforts to evaluate non-residential Job Corps services (e.g., Greenberg and Shroder 2004, Schochet et al. 2008).
The evaluation of Job Corps followed the outcomes of over 15,000 experimental subjects for up to eight years using survey and administrative data. The effect of training on earnings became gradually positive as individuals graduated from the program, and then remained statistically significantly different from the control group for up to four years afterwards. At the same time, government transfers and crime rates fell (e.g., Schochet et al. 2008). There was substantial heterogeneity in outcomes – the effects were strongest for those 20-24 years old at the time of training, and weakest for Hispanics.
A concern with these findings was that the overall level of earnings and the size of the treatment effects were quite different in the administrative data than in the survey data. While survey data are more likely to be affected by endogenous attrition, administrative data are not a panacea: They exclude under-the-table employment, which may be common in the Job Corps population.20 They also cannot address the problem that wages are observed only for those who are employed, itself an intermediate outcome of the program (e.g., Lee 2009).
19 Of course, if Job Corps did not exist, the ecosystem of other available programs would presumably change. This is formally a SUTVA violation, and implies that control group mean outcomes may not equal what would be seen in the absence of the program.
An important question regarding Job Corps is the relative performance of the different Job Corps centers, which operate in different labor markets and are (sometimes) run by contractors rather than directly by the government. Schochet and Burghardt (2008) use the Job Corps evaluation data to estimate separate treatment effects by site, finding that these are not strongly correlated with the non-experimental measures that have been used to assess site performance.
A final issue in the Job Corps evaluation, not to our knowledge addressed in the literature, is that the program may be large relative to the relevant labor markets, creating the possibility of important spillovers from treated to control study participants.
A final, smaller category of large-scale social experiments of training programs focused specifically on unemployed (displaced) workers. As we will discuss below, some of these RCTs evaluated programs providing a broad array of reemployment services that also included some degree of training. This raises a similar issue to what we highlighted above with welfare experiments – experimental evaluations generally identify the “black box” effect of the overall programs, but not the components or mechanisms responsible for those effects.
20 Kornfeld and Bloom (1999) show that this is the case for participants in the Job Training Partnership Act (JTPA) evaluation.
The Individual Training Account (ITA) Experiment, running from 2001 to 2005, directly evaluated different modes of training provision prescribed by the 1998 Workforce Investment Act (WIA). WIA allowed local agencies to impose different degrees of counseling and supervision of workers’ training choices, and the ITA experiment evaluated the effect of these choices on actual training received and labor market outcomes. Effectively, the ITA experiment compared three service models. Guided Choice and Maximum Choice had standardized subsidies for training, but the former required counseling by a caseworker while the latter had no counseling requirement. A third model, Structured Choice, was effectively like Guided Choice but offered individualized, and typically more generous, training awards.21
The findings indicated that either more generous awards (Structured Choice) or less counseling (Maximum Choice) led to a higher incidence of training (Perez-Johnson et al. 2011). Earnings increased for workers in Structured Choice relative to Guided Choice five years after the treatment. (Earnings effects were higher but not statistically different for Maximum Choice relative to Guided Choice or to a control group.) While Structured Choice was estimated to be cost efficient to society, it was more expensive for the workforce system, and most agencies adopted Guided Choice as the leading model. More recently, an ongoing experiment (the WIA Adult and Dislocated Worker Programs Gold Standard Evaluation, discussed below) evaluates directly the intensive and training services provided under WIA.
21 Originally, under Structured Choice caseworkers were supposed to play a more active role in training choice. However, most caseworkers did not feel they had enough knowledge of local labor markets or the worker’s skills to take on such an active role.
An issue that is common to all of the job training experiments is the possibility that individuals assigned to the control group may have received training through other channels that would not necessarily have been tracked in the experimental data. These hidden treatments are likely to attenuate the estimated training effects – insofar as control participants are receiving substitute treatments, the evaluations identify only the differential effect of the public training program, rather than the overall effect of training relative to none. While this could partly explain low estimated treatment effects, this has not been examined carefully in the literature (though, as we discuss below, it has received substantial attention in some other domains, most notably the evaluation of early childhood education).
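The attenuation argument can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a constant earnings effect of training and assuming that substitute training obtained by controls is exactly as effective as the program's; both are simplifying assumptions of ours. The function names and all numbers are hypothetical and are not taken from any of the evaluations discussed here.

```python
def treatment_control_gap(effect_of_training, trained_share_program, trained_share_control):
    """Difference in mean outcomes between program and control groups.

    Assumes (hypothetically) that every unit of training, public or
    substitute, raises the outcome by the same constant amount.
    """
    return effect_of_training * (trained_share_program - trained_share_control)


def rescaled_effect(gap, trained_share_program, trained_share_control):
    """Back out the effect of training relative to none from the observed gap.

    This only works if training receipt in the control group is actually
    measured, which is exactly what hidden treatments prevent.
    """
    contrast = trained_share_program - trained_share_control
    if contrast <= 0:
        raise ValueError("program group must receive more training than controls")
    return gap / contrast


if __name__ == "__main__":
    effect = 1000.0  # hypothetical earnings gain from training
    # No substitution: controls receive no training at all.
    print(treatment_control_gap(effect, 0.70, 0.00))
    # Hidden treatments: 30 percent of controls train elsewhere.
    print(treatment_control_gap(effect, 0.70, 0.30))
```

Under these assumptions the experimental contrast shrinks in proportion to the share of controls who obtain substitute training, even though the effect of training relative to none is unchanged.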
Although a broad range of findings from different treatments makes it hard to generalize, two themes have emerged from training program social experiments. First, while training for less advantaged adults and the unemployed can have beneficial effects, most training programs for disadvantaged youths fail to achieve strong results. An important exception is Job Corps, which has shown short- and medium-term positive effects for at least some of its participants. Second, the effects of training tend to accrue gradually over time, making them hard to detect in research designs that combine multiple treatments or that do not have sufficient data or samples to precisely estimate medium- to long-term effects.
c. Job Search Assistance
From the inception of welfare programs in the U.S. it was suspected that neither better work incentives nor better human capital would be sufficient to place hard-to-employ welfare recipients or disadvantaged youth into lasting employment, and that part of the challenge derived from disconnection from the world of work. At the same time, it was not clear which of a range of support services aiding job placement would be effective. Hence, a large number of RCTs have evaluated a range of job search assistance (JSA) programs for low-income workers and youth. Other studies have focused on unemployment insurance recipients and other unemployed workers, who have traditionally been eligible for search assistance from the U.S. government. Hence, while training evaluations have mostly concerned programs aimed at low-income workers, job search assistance experiments have evaluated programs geared towards a wider range of unemployed workers from the mid-1970s to today. As in training evaluations, however, an important challenge in studies of job search assistance is measuring the counterfactual: What sort of assistance, if any, was received by those excluded from the program under study?
An early wave of JSA program experiments geared towards welfare recipients occurred from the early 1970s to the mid-1980s, alongside similar studies of labor supply and training programs aimed at the same population. These were mostly evaluations of local programs funded by the federal government. There is a long history of programs providing placement and training services for welfare recipients in the United States, going back at least to the Work Incentive Program (WIN) initiated in 1967. WIN was criticized on a range of fronts (e.g., Gold 1971). The first wave of federally-funded evaluations tested services provided by the WIN program and alternative programs for WIN-eligible welfare recipients (e.g., Grossman and Roberts 1989). These culminated in the National Evaluation of Welfare-to-Work Strategies (NEWWS) in 1990, which was a large-scale evaluation of 11 programs combining JSA, training, and enforcement of job search requirements in 7 different sites in the U.S.
The results from randomized evaluations of different WIN services were mixed (e.g., Greenberg and Shroder 2004). The evaluation of so-called “job clubs” in 1976-1979 showed substantial increases in employment and reductions in welfare receipt. As a result, job clubs became an integral part of services received by welfare recipients. However, the evaluation was based on a relatively small sample, follow-up was limited to one year, and the results indicated substantial, hard-to-explain heterogeneity in the findings across subgroups and treatment sites. In contrast, the evaluations discussed in Grossman and Roberts (1989) show less consistent effects of JSA under the WIN program.
The much larger evaluation of NEWWS found short-term increases in employment and reductions in welfare receipt. These effects dissipated during the five-year follow-up period. As in other evaluations occurring in the early to mid-1990s, such as GAIN discussed above, this may be due in part to the high-pressure labor market of the 1990s. The presence of such cyclical effects is a potentially important confounder limiting the interpretation of the effects of labor market program studies.
A second wave of experiments occurred in the run-up to welfare reform in the mid-1990s, and again saw substantial state-level involvement. As with labor supply and training studies in this period, these studies tended to examine contemplated changes to existing programs and to involve large samples. These included Project Independence in Florida in 1990 (over 13,000 treatment and 4,000 control subjects), the Indiana Welfare Reform Evaluation in 1995 (over 67,000 treatment and 4,000 control subjects), and the LA Jobs First GAIN evaluation in 1995 (over 15,000 treatment and 5,000 control subjects).22 Among these, only the GAIN evaluation discussed above allows inference about the role of JSA alone. The findings confirm that JSA can yield substantial gains in employment, at least in the short term.
In parallel, another group of experiments evaluated JSA services provided to recipients of unemployment insurance. Most of these included a combination of direct job search assistance, instructions on how to search for a job, and verification of job search. These experiments, to a large extent discussed in Meyer (1995), included Nevada (1977, 1988), Charleston (1983), Texas (1984), New Jersey (1986), and Washington State (1986). Another set of experiments during the same period assessed only the effect of verification of job search requirements. Ashenfelter, Ashmore, and Deschenes (2005) discuss experiments in Connecticut, Massachusetts, Tennessee, and Virginia.23
As summarized by Meyer (1995), a core finding of these studies is that JSA reduces unemployment insurance (UI) receipt, at least in the short run. The effects are small, but cost effective from the point of view of the UI agency. The effects on earnings tend to be imprecise, consistent with the possibility that the program impacts derive from workers who leave the UI system without finding jobs. Little is known about which components of JSA matter. Experiments in Nevada and Minnesota suggest that intensive JSA has much stronger effects than do more limited treatments. There is mixed evidence as to whether the verification requirement alone matters: The experiments discussed in Ashenfelter et al. (2005) indicate no effects, while a Maryland study summarized in Klepinger, Johnson, and Joesch (2002) did find effects. This question is a key aspect of ongoing evaluations of the Reemployment and Eligibility Assessment system, discussed below.
22 There also have been evaluations of JSA services explicitly directed at low-income youth, but most such RCTs that we found were relatively small. The evidence on this subject quoted most frequently is related to the job search component provided in the JTPA and Job Corps programs.
23 Other such experiments include Minnesota (1988), Maryland (1994), and Washington D.C./Florida (1995-1996); see Greenberg and Shroder (2004).
Since this early wave of UI experiments, the component of the UI system offering job search assistance and training has been repeatedly reformed, with several evaluations along the way. The Worker Profiling and Reemployment Services (WPRS) program was instituted in 1993. Under the WPRS states are required to profile their UI claimants in order to identify those most likely to exhaust UI benefits and refer them to employment-related services.24 This program was evaluated via a natural experiment in Kentucky beginning in 1994 (Black, Smith, Berger, and Noel 2003, Black, Galdo, and Smith 2007). The findings from the WPRS study suggest that the mere receipt of a letter asking individuals to come into the office for JSA services reduces UI receipt and raises earnings. An important open question is whether this influential finding is replicated in a true RCT and in less favorable labor market conditions.
24 The services include (1) an orientation session to explain what reemployment services are available; (2) an assessment of the claimant’s specific needs; and (3) development of an individual plan for services based on the assessment. Claimants referred to reemployment services must participate in them as a condition of continuing eligibility. Allowable services include job search assistance and job placement services, such as counseling, testing, and providing occupational and labor market information; job search workshops; job clubs and referrals to employers; and other similar services.
The Workforce Investment Act (WIA) of 1998 combined most job placement services and training services provided under the auspices of the federal government under one roof, the so-called one-stop centers (e.g., Jacobson 2009). These centers, renamed American Job Centers in 2012, provide both “core” employment services (e.g., job search assistance) and “intensive” WIA services (e.g., career counseling and training) to the three core constituencies – unemployed workers, welfare recipients, and hard-to-employ young workers.
As the structure of service provision has evolved, additional RCTs have evaluated the system’s effectiveness at placing workers. For example, in 2005 the Department of Labor’s Employment and Training Administration launched a program called Reemployment and Eligibility Assessment (REA), consisting of mandatory in-person visits aimed at speeding the reconnection of UI claimants to the workforce.25 The REA meeting includes an eligibility review, provision of labor market information, development of a reemployment plan, and referral to more specific reemployment services. The first wave of randomized evaluations of the effectiveness of the REA counseling process took place in nine states beginning in 2005; a second wave of evaluations took place in four states in 2009. In both cases, the evaluations found that the REA requirement and services reduce UI benefit receipt (Benus et al. 2008, Poe-Yamagata et al. 2011). Earnings outcomes were studied in only one state (Florida), and were positive. An ongoing REA evaluation examines the difference in the effect of enforcing the interview requirement alone relative to the combined effect of the interview plus services (Klerman et al. 2013). A simultaneous evaluation begun in 2011, the WIA Adult and Dislocated Worker Programs Gold Standard Evaluation,26 complements the evaluations of REA, WPRS, and earlier JSA programs by focusing on the effectiveness of WIA’s intensive and training services geared to unemployed adults not covered by the earlier evaluations.
25 The REA program was instituted to counteract the trend towards processing of UI claims by telephone and the internet. The concern was that the net effect of these changes was to reduce in-person contact and hence the opportunity to monitor job search activity and orient UI claimants to services available to speed their reemployment (e.g., O’Leary 2006).
A summary of the wide range of studies of JSA indicates important heterogeneity of effects by the population targeted. For welfare recipients, a difficulty in assessing the effect of JSA is that many experiments tested JSA in conjunction with other programs. Those studies that focus mainly on the effects of JSA, such as the randomized evaluations of WIN, NEWWS, or GAIN, often find positive effects on employment and earnings and negative effects on welfare receipt (but mixed effects at best on total income). These effects tend to be short-lived, and less is known about the longer-term outcomes. There is also little known about the potentially important role played by context, such as local labor market conditions.
In studies of JSA for UI recipients, a common result is a precisely estimated but rather small effect – e.g., a reduction of about one week of UI benefits, with no corresponding positive effect on earnings – unless the services provided are very intensive. The frontier in this area is assessing to what extent these effects arise from the threat of enforcement of service requirements spelled out by law, basic JSA services themselves, or more intensive services.
26 See http://www.mathematica-mpr.com/our-publications-and-findings/projects/wia-gold-standard-evaluation.
d. Practical Aspects of Implementing Social Experiments
Clearly, the implementation of large-scale social experiments is complex and faces a range of practical hurdles that can affect the quality of the results. Sections II.c and IV of this paper focus on a number of design issues that can limit the ability of even an ideal experiment to provide answers to the questions of interest.
Beyond these conceptual design issues, there are some common challenges and practical considerations that have come up over and over in the conduct of social experiments in the labor market. These play important roles in influencing the topics and questions that are studied via social experiments and in informing the study designs.
One set of challenges derives from the fact that, as noted above, one of the defining characteristics of social experiments is that they intend to examine programs that are already in place or might be put in place in essentially the same form that was used in the experiment. For this purpose, the experimental samples and hence the sampling frame need to be representative of the population that the program serves. This is a challenge in the case of many labor market programs, in part because the sampling frame is often available only to program operators or the government, and may be difficult to access due to formal approval processes.
Once the sampling frame is obtained, it is necessary to randomly assign some members of the sample to the program of interest and others to a control condition, which might be exclusion from the program or an alternative program design. This, too, can be difficult when the program is already in place. For example, if the program in question exists within an ecosystem of other programs, services, and service providers, it may be hard to exclude participants from the program or, if this is done, to avoid also excluding them from other programs that are administratively integrated. Excluding a participant from job search assistance offered under the Workforce Investment Act (WIA), for instance, might also in practice exclude him or her from job training and other programs, as the same offices that provide job search assistance also do screening and referrals for other services. While some of these problems might be reduced by studying programs not already in place, as in the case of the Negative Income Tax experiments or the National Supported Work Demonstration, this can be quite costly, as the sorts of programs typically studied involve substantial program costs – commonly in the thousands of dollars per participant.
A second group of challenges has to do with the difficulty of enforcing compliance with randomization after it is conducted. Again, the use of actual programs tested in real-world settings limits the options. A common challenge in early experiments was that service delivery was delegated to individual caseworkers or sites that were both widely dispersed and not closely involved with the experimental design. This raises the possibility that caseworkers may deviate from random assignment, for example by ensuring that a potential participant viewed as especially needy is not assigned to the control group. A key concern in the National Job Corps Study, for instance, was to ensure that local program operators properly implemented the randomization. Modern practice centralizes the random assignment process, carefully tracking participants’ initial assignments to ensure that participants assigned to undesirable treatment conditions do not re-enter the randomization to obtain a better assignment.27
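To illustrate the logic of centralized assignment, the following is a minimal sketch of a registry that draws each applicant's assignment once and returns the stored result on any later request. It is our own illustration, not the procedure of any particular study; the class name, the identifier normalization, and the 50 percent treatment share are all hypothetical choices.

```python
import random


class AssignmentRegistry:
    """Hypothetical central registry: one random draw per applicant, ever."""

    def __init__(self, treatment_share=0.5, seed=12345):
        if not 0.0 < treatment_share < 1.0:
            raise ValueError("treatment_share must lie strictly between 0 and 1")
        self._rng = random.Random(seed)   # a fixed seed makes the draw auditable
        self._treatment_share = treatment_share
        self._assignments = {}            # normalized applicant id -> assignment

    def assign(self, applicant_id):
        # Normalize the identifier so trivial variations of the same id
        # (case, surrounding whitespace) cannot trigger a fresh draw.
        key = applicant_id.strip().upper()
        if key not in self._assignments:
            draw = self._rng.random()
            self._assignments[key] = (
                "program" if draw < self._treatment_share else "control"
            )
        # Repeat applications return the original assignment unchanged.
        return self._assignments[key]


if __name__ == "__main__":
    registry = AssignmentRegistry()
    first = registry.assign("A-001")
    second = registry.assign(" a-001 ")   # same person re-applying
    print(first, second, first == second)
```

In practice the hard part is linking repeat applicants who present different identifying information; the sketch assumes a reliable unique identifier is available.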
A third set of challenges has to do with the measurement of participant outcomes. Once again, this challenge derives, in large part, from the use of real-world populations as experimental subjects and from the large and heterogeneous subject pools common in social experiments. These make it more expensive to ensure high response rates than in smaller and more targeted field experiments.
In many cases this challenge can be addressed by using administrative data to measure some outcomes. Administrative records may come either from the program under study – for example, unemployment insurance payment records for studies of job search incentives for unemployment insurance recipients – or from the records of other government programs (e.g., tax records). While this can resolve the attrition problem at low cost, it is often contingent on government cooperation or approval. Such cooperation is more likely in large-scale social experimental evaluations of existing programs than in other types of studies. Administrative data can also limit the set of impacts that can be studied, potentially creating important ambiguities in the interpretation of estimated treatment effects. In the unemployment insurance case, for example, it is not clear whether a negative effect of increased job search enforcement on unemployment benefit payments indicates that people are finding jobs faster, or just that many people are leaving the program before finding jobs as a way of avoiding onerous enforcement procedures.
27 For a discussion of approaches to address this problem, including related software, see, e.g., Crepon et al. (2013).
IV. Going Beyond Treatment-Control Comparisons to Resolve Additional Design Issues
Whether one is interested in structural parameters or program evaluation, many questions of policy or scientific interest in labor and public economics require going beyond the basic RCT design described in Section II.a. We discussed a number of these questions in Section II.c. Here, we discuss ways to extend the basic RCT design to provide answers to these questions.
We organize our discussion around the major potential design issues we mentioned in Section II.c. For each, we discuss proposed solutions and, where relevant, point out potential extensions and limitations. We begin by discussing studies that address aspects relating to internal validity, including SUTVA violations (e.g., potential general equilibrium effects) and endogenously observed outcomes. We then discuss studies that address external validity concerns, including site and sub-group effects; effects on subpopulations other than experimental compliers; hidden or multiple treatments; mechanisms for treatment effects; and studies of optimal or simply alternative policies.
In some cases, the identified issues can be addressed ex post (after an experiment is complete), generally by imposing additional structure. In many of these examples the additional structure imposed is justified by appeal to theoretical considerations and is just sufficient to extend the RCT to address a specific question and the design issue it raises. In that sense, the studies can be viewed as an effort to bridge pure experimental or quasi-experimental approaches, credibly identifying a limited number of (potentially composite) causal parameters, with more traditional structural estimation that obtains a fuller characterization of the economic problem via the imposition of substantial additional assumptions. In the ideal case, they maintain the best of both worlds, though they also share some of the limitations of each.
Another possibility is to build the structural questions of interest into the design of the experiment ex ante. This can provide credible identification with even fewer structural assumptions than are required for after-the-fact analyses, though it can sometimes require a quite complex – and potentially difficult to administer – experimental design. There are fewer existing examples of this, but we discuss them where appropriate.
We discuss each of the design issues identified earlier in turn. Our discussion is meant to highlight the different approaches, as well as to clarify the scope, potential, and difficulties that arise when extending inference from standard RCTs to a broader range of questions.
a. Spillover effects and SUTVA
Social experiments in labor economics typically occur in the context of the local or regional labor market. If the number of workers participating in the program is large relative to the relevant segment of the labor market, the program could have an effect on the labor market outcomes of the control group. This would be a violation of SUTVA – the difference in outcomes between treated and control individuals would differ from the overall effect of the program on the entire population relative to not implementing the program, which is often the effect of primary interest.
Many social experiments in the United States have not raised serious spillover issues, as the treated populations have been small relative to the local labor market. However, this may not be true for large experiments, such as the National Job Corps Study. Welfare experiments may also create spillover effects if labor markets for former welfare recipients are sufficiently segmented.
A related issue is that comprehensive program evaluations in many cases should include spillover effects that are not captured by small-scale pilot studies. If the pilot programs are eventually scaled to broader populations of low-income workers – which has happened, among others, in the case of welfare reform, of training provided through WIA, and of job search assistance services provided by WPRS or REA – then the potential extent of spillover effects would nevertheless matter, since any spillover effect would have to be included in a welfare assessment of the program. This would create systematic differences between the outcomes of the pilot study and the program effects of interest.
i. Addressing the issue ex post
Despite the potential prevalence of spillovers in social experiments in the labor market, relatively few studies have dealt directly with them or with other failures of SUTVA. A handful of studies have tried to estimate spillover effects directly using inter-regional comparisons (e.g., Blundell, Dias, Meghir, and Van Reenen 2004; Ferracci, Jolivet, and van den Berg 2010; Gautier, Muller, Rosholm, Svarer, and van der Klaauw 2012). There are broadly two approaches, neither of which is able to fully identify the spillover effect. One approach is to compare control group outcomes to those of observably similar individuals in areas where no one is treated. Of course, there may be other explanations for differences seen in this observational comparison. Another approach is to compare the effect of treatment across sites with different treatment intensity or labor market conditions. This is again typically an observational comparison, as in most cases neither the treatment site nor the size of the treatment group (and hence the amount of potential spillover) is randomly assigned. For example, Hotz (1992) discusses the non-random selection of sites for the JTPA evaluations. Allcott (2015) studies the sources of observed bias from site selection in a large electricity conservation experiment. A recent paper by Crepon, Duflo, Gurgand, Rathelot, and Zamora (2013; see also Baird et al. 2015), discussed further below, resolves this problem in the context of a job search assistance program by randomly assigning both the treatment and the number of workers treated.
Absent such a multi-stage experimental design, relatively few options are available to researchers to assess the degree of the actual or potential spillover effects present in the context of their evaluation. An area of research where spillover effects have received substantial recent attention is the analysis of the employment and welfare impacts of extensions in unemployment insurance benefits. Here, spillover effects arise because treated and untreated individuals compete for the same positions; the degree of the spillover effect therefore depends on the job creation response to the treated group’s labor supply change. To assess the potential degree of spillovers, one can in principle use estimates of the matching function to adjust micro-econometric estimates of the effect of policy-induced changes in unemployment insurance durations on unemployment duration or exit hazards for the presence of crowding.28 Such ad-hoc simulations are partial-equilibrium in nature, and could be interpreted as a short-run effect, when vacancies have not yet adjusted. Landais, Michaillat, and Saez (2015) specify a general equilibrium model of the labor market that incorporates both crowding and vacancy responses. In a standard, competitive search-matching model, the vacancy response to changes in labor supply is sufficiently strong to offset the crowding effect completely.
In the spirit of using random variation in the treatment across localities to assess the presence of spillover effects, a couple of recent papers have tried to exploit region-specific changes in policy-induced UI variation in the U.S. to assess the full effect of the policy on the entire labor market (Hagedorn, Karahan, Manovskii, and Mitman 2015, Hagedorn, Manovskii, and Mitman 2015). Since UI variations usually depend on economic conditions at the state level, these studies use border communities unaffected by the policy change as counterfactuals.29 A concern with this approach is that the presence of spatial spillovers between adjacent or related labor market areas would again constitute a failure of SUTVA.30
28 One added difficulty in the case of UI is that in most cases in the U.S. the policy-induced changes in the level or duration of UI benefits are a function of labor market conditions – making it crucial to properly control for the direct effect of local labor market conditions.
Another source of SUTVA failures is interactions between treatment and control participants. Such ‘dilution’ effects can lead to an underestimation of the treatment effect. Where possible, a typical approach to circumvent such interactions is to raise the level of randomization (say, from a sub-group within a site to a whole site). This approach can help to avoid interactions between individuals in the treatment and control groups. It does not resolve potential interactions between treated participants. This may be part of the mechanism of the treatment; it may also be a potentially unintended source of variation in treatment intensity that we discuss under site effects. In either case, when designing an evaluation, it would be valuable to consider ways of keeping track of social interactions, perhaps by asking about friends in a baseline survey, or monitoring (or manipulating) the use of certain kinds of social media. Another valuable target for data collection is factors relating to how treatment was obtained or take-up was decided. Such information may be used to stratify the analysis by the predicted degree of SUTVA violations or at least assess the potential for significant departures from SUTVA.
29 A key practical difficulty there is that measures of unemployment rates at the sub-state level are often very noisy. Estimates using administrative employment data based on the universe of private employees show little sign of spillover effects (Johnston and Mas 2015).
30 Cerqua and Pellegrini (2014) develop alternative estimates to the TOT that take into account the degree of spatial spillover effects. The Hagedorn et al. papers have been quite controversial; see, for example, responses from Chodorow-Reich and Karabarbounis (2016) and Coglianese (2015).
ii. Addressing the issue ex ante through the design of the experiment
In some circumstances it may be possible to avoid, or study, spillover effects by appropriately structuring a randomized experiment. For example, in the spirit of the non-experimental studies cited above, treatment and control groups could be chosen to be sufficiently distant to avoid spillover effects. Alternatively, the treatment group could be chosen to be sufficiently small that spillover effects are unlikely to be a problem. If the spillover effects themselves are of direct interest, the experimental manipulation could be combined with pre-existing variation in the strength of potential spillover effects (e.g., across submarkets), if available. The risk of such ad hoc or hybrid approaches is that they may lose comparability of the control group, or confound spillover with other variation in treatment effects.
A preferable approach if spillover effects are potentially present is to manipulate both the treatment and the size of the treatment group (and hence the amount of spillover) experimentally. Baird et al. (2015) develop this strategy formally. Crepon, Duflo, Gurgand, Rathelot, and Zamora (2013) implement it in the context of a public program assisting unemployed workers in their search for a job in France. The researchers both manipulate who gets assigned into the job search assistance program within a region (the classic experimental design) and randomly vary between regions the share of individuals assigned to the treatment group. The manipulation of both regional treatment share and individual treatment status allows separate experimental identification of the effect of the program holding the spillover effect constant and the combined program and spillover effects at various treatment intensities. The latter parameters are ultimately relevant for a cost-benefit or welfare analysis of the program and for extrapolation to alternative policy settings.
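The logic of such a two-stage design can be illustrated with a small simulation. The following is a minimal sketch under assumptions of our own: a linear outcome model in which treatment raises the outcome by `beta` and crowding lowers everyone's outcome by `gamma` times the regional treatment share. The parameter values, the set of treatment shares, and the function names are hypothetical, and the simple cell-mean contrasts are not the estimator used in the studies cited above.

```python
import random
from statistics import mean


def simulate_saturation_design(n_regions=200, n_per_region=200,
                               beta=0.10, gamma=0.08, seed=0):
    """Simulate a two-stage design: regions draw a treatment share,
    then individuals are randomized within region at that share."""
    rng = random.Random(seed)
    shares = [0.0, 0.25, 0.5, 0.75]
    # Stage 1: balanced random assignment of treatment shares to regions.
    region_shares = [shares[i % len(shares)] for i in range(n_regions)]
    rng.shuffle(region_shares)
    rows = []
    for share in region_shares:
        for _ in range(n_per_region):
            # Stage 2: individual randomization at the regional share.
            treated = rng.random() < share
            outcome = (0.5
                       + beta * (1.0 if treated else 0.0)  # direct effect
                       - gamma * share                     # crowding on everyone
                       + rng.gauss(0.0, 0.1))
            rows.append((share, treated, outcome))
    return rows


def cell_mean(rows, share, treated):
    return mean(y for s, d, y in rows if s == share and d == treated)


def contrasts(rows, share):
    """Three experimental contrasts at a given regional treatment share."""
    pure_control = cell_mean(rows, 0.0, False)  # regions where no one is treated
    treated = cell_mean(rows, share, True)
    control = cell_mean(rows, share, False)
    return {
        "within_region": treated - control,             # effect holding spillover fixed
        "spillover_on_controls": control - pure_control,
        "combined": treated - pure_control,             # program plus spillover
    }


if __name__ == "__main__":
    data = simulate_saturation_design()
    for s in (0.25, 0.5, 0.75):
        print(s, contrasts(data, s))
```

Because both stages are randomized, each contrast is a comparison of randomly assigned groups; a conventional single-stage experiment would recover only the within-region contrast.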
Similar strategies are available for other SUTVA failures, arising for example if some individuals in the control group get accidentally treated, or if treatment compliance depends on the take-up rate among peers. In some cases, one may choose the experimental setting to try to minimize SUTVA problems. For example, one can devise strategies to limit the potential for non-compliance (e.g., in the case of web-based information treatments, access could be based on hardware address rather than passwords).
Another potentially interesting strategy is to make the degree and structure of SUTVA violations part of the analysis, as in the discussion of spillovers above. This may provide insights into the “black box” of how a program might work in a real-life setting and hence enhance external validity.31 For example, one could experimentally vary the number of treated units in a reference group or network (e.g., classrooms, friends, etc.), examining interactions among individual treatment status, group treatment share, and perhaps also predetermined factors (such as the tightness of the group) that determine the degree of departure from SUTVA. Depending on the context, it may be possible to more explicitly manipulate interactions between individuals by introducing an additional treatment to the experimental design – for example, a forum in which interactions are facilitated.
31 Note that there is a parallel here with the issue of treatment compliance and heterogeneous treatment effects. Here, the compliance function is assumed to depend on treatment status of other individuals, and hence experimentally manipulating compliance probabilities is presumably more complex. Yet, as in the standard case of heterogeneous treatment effects, for external validity it is important to trace out the potential compliance-related interactions as fully as possible.
b. Endogenously observed outcomes
In many labor market experiments, key outcomes include measures observed only for individuals who are employed, such as hours worked and wages. Hence, the impact of, say, welfare-to-work programs or job training programs can only partially be assessed based on simple RCTs alone. Although many studies report experimental impacts on the endogenously observed outcomes, these are understood to suffer from serious selection problems. In the same way, non-random attrition in follow-up data collection can bias the results of nearly any evaluation.
To illustrate, consider a program aimed at unemployed workers that includes skill development and job search assistance modules. We are interested in whether the program raises the probability that a participant is employed one year after participation and whether it makes them more productive when employed. For simplicity, we assume that participation is randomly assigned and compliance is perfect.
We have two outcomes here. We denote employment status by $y_i = D_i y_{1i} + (1 - D_i) y_{0i}$. For those who are employed at the follow-up survey, we observe the wage $w_i = D_i w_{1i} + (1 - D_i) w_{0i}$. Treatment effects of the program on the two outcomes are $\tau_{yi}$ and $\tau_{wi}$. (We can imagine that $w_{di}$ is well defined for an individual with $y_{di} = 0$, $d \in \{0, 1\}$, but simply not observed. It can be thought of as the individual’s latent productivity, that which he/she would be paid if a job were found.)
Estimation of $E[\tau_{yi}]$ is straightforward, as discussed above. But the impact on
wages is much harder. In general, it is not possible to identify the average treatment
effect $E[\tau_{wi}]$; the treatment-on-the-treated effect $E[\tau_{wi} \mid D_i = 1]$; or even the average
treatment effect for the subpopulation that would have been employed with or
without the program (for whom $\tau_{wi}$ is least problematic), $E[\tau_{wi} \mid y_{0i} = y_{1i} = 1]$.
The problem here is that it is impossible to distinguish, within each $D_i$ group,
between those workers who would also have worked in the counterfactual and
those who would not have. Consider the treatment-control difference in mean
observed wages:
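\[
\begin{aligned}
& E[w_i \mid y_{1i}=1, D_i=1] - E[w_i \mid y_{0i}=1, D_i=0] \\
&\quad = E[w_{0i} + \tau_{wi} \mid y_{1i}=1, D_i=1] - E[w_{0i} \mid y_{0i}=1, D_i=0] \\
&\quad = E[\tau_{wi} \mid y_{1i}=1, D_i=1] + \big( E[w_{0i} \mid y_{1i}=1, D_i=1] - E[w_{0i} \mid y_{0i}=1, D_i=0] \big) \\
&\quad = E[\tau_{wi} \mid y_{1i}=1, D_i=1] \\
&\qquad + \big( E[w_{0i} \mid y_{0i}=1, y_{1i}=1, D_i=1] - E[w_{0i} \mid y_{0i}=1, y_{1i}=1, D_i=0] \big) \\
&\qquad + \big( E[w_{0i} \mid y_{1i}=1, D_i=1] - E[w_{0i} \mid y_{0i}=1, y_{1i}=1, D_i=1] \big) \\
&\qquad - \big( E[w_{0i} \mid y_{0i}=1, D_i=0] - E[w_{0i} \mid y_{0i}=1, y_{1i}=1, D_i=0] \big).
\end{aligned}
\]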
The first term here is the average treatment effect in the subpopulation that works
under treatment. It may not equal the overall average treatment effect, but insofar
as the potential wages of those who do not work are not relevant to social welfare, it
is arguably the parameter of interest. The second term solely reflects selection into
treatment, and is zero under random assignment. But the third and fourth terms
have to do with selection into employment, not selection into treatment. Random
assignment does not ensure that they are zero, and the treatment-control contrast
among workers may therefore be badly biased relative to the impact on wages for
any fixed group of workers.32
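The numerical example in footnote 32 can be checked directly. The sketch below encodes the three worker types from that footnote and computes the four quantities it reports. The wage levels (10 and 20) are hypothetical values chosen only for illustration; all results are expressed as multiples of $w_H - w_L$, so they do not depend on that choice.

```python
from fractions import Fraction

# Hypothetical wage levels; only the gap w_H - w_L matters for the results.
W_L, W_H = Fraction(10), Fraction(20)

# Worker types from the footnote: (population share, y0, y1, w0, w1).
TYPES = [
    (Fraction(6, 10), 0, 1, W_L, W_L),  # always low productivity; works only if treated
    (Fraction(2, 10), 1, 1, W_H, W_H),  # always high productivity; always works
    (Fraction(2, 10), 1, 1, W_L, W_H),  # becomes high productivity if trained; always works
]


def mean_observed_wage(treated):
    """Mean wage among those observed working in one experimental arm."""
    wage_sum = Fraction(0)
    employed_share = Fraction(0)
    for share, y0, y1, w0, w1 in TYPES:
        employed = y1 if treated else y0
        wage = w1 if treated else w0
        wage_sum += share * employed * wage
        employed_share += share * employed
    return wage_sum / employed_share


gap = W_H - W_L

# Average effect on employment.
employment_effect = sum(s * (y1 - y0) for s, y0, y1, _, _ in TYPES)

# Average effect on latent productivity, in units of (w_H - w_L).
latent_effect = sum(s * (w1 - w0) for s, _, _, w0, w1 in TYPES) / gap

# Average wage effect among those employed with or without the program.
always = [(s, w0, w1) for s, y0, y1, w0, w1 in TYPES if y0 == 1 and y1 == 1]
always_effect = sum(s * (w1 - w0) for s, w0, w1 in always) / sum(s for s, _, _ in always) / gap

# Treatment-control contrast in observed wages (conditional on employment).
naive_contrast = (mean_observed_wage(True) - mean_observed_wage(False)) / gap

print(employment_effect, latent_effect, always_effect, naive_contrast)
```

The contrast among workers comes out negative even though no individual's wage falls, which is the selection-into-employment bias captured by the third and fourth terms of the decomposition.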
One fallback approach is to examine only the program’s effect on the share of
participants earning high wages, treating low-wage workers and non-workers the
same. This effect can be estimated without bias. Another fallback is to include the
non-employed in the wage analysis, with wages set to zero. In some cases this is the
impact of interest, and it is correctly identified by the experiment.
However, it is quite misleading if interpreted as the magnitude of the effect on
productivity, either for the full population or for the subgroup that would have been
employed with or without treatment. Without an ability to measure counterfactual
employment status at the individual level, the latter effects are not identified.
i. Addressing the issue ex post
Non-random attrition in particular has been a long-standing concern in the
experimental literature in labor economics (e.g., Hausman and Wise 1979). A classic
experimental design would be deemed successful if attrition is low and balanced in
terms of magnitude and observable characteristics between the treatment and
control groups. If this is the case, reweighting the samples may still recover the
TOT or LATE among the original set of compliers (e.g., Ham and Li
2011). Yet, there are relatively few explicit attempts in the literature to address
selection bias in other contexts.
32 Consider a training and job-search assistance program. Suppose 60% of workers will be always low productivity ($w_{1i} = w_{0i} = w_L$), 20% will be always high productivity ($w_{1i} = w_{0i} = w_H$), and 20% will become high productivity if exposed to the training sequence ($w_{0i} = w_L$, $w_{1i} = w_H$). All of the second and third groups will find jobs, with or without search assistance ($y_{0i} = y_{1i} = 1$), but those in the first group of low-skill, impossible-to-train workers will find work if and only if they receive search assistance ($y_{0i} = 0$, $y_{1i} = 1$). In this setting, the program’s average treatment effect on employment is 0.6; the average effect on latent productivity is $0.2(w_H - w_L)$; and the average effect on wages of those who would work with or without the program is $0.5(w_H - w_L)$. The estimated treatment effect on wages conditional on employment is $-0.1(w_H - w_L) < 0$. Selection has led to a perverse estimate here: The training program has a positive effect on 20% of participants and a negative effect for no one, but the experiment appears to indicate that it reduces earnings.
A large literature in labor economics has dealt with sample selection
problems, especially in the analysis of wages and hours in the context of the classic
human capital and labor supply models. Largely based on that literature, here we
will review several approaches to deal with selection bias: the use of control
functions to address selection; estimation of percentile effects instead of mean
impacts; use of additional data to control for selection; construction of bounds based
on selection probabilities; and construction of bounds using theory.
Parametric selection corrections
The ‘classic’ approach to control for selection bias in estimating
treatment effects on wages or hours worked is based on control functions. Labor
supply theory, along with parametric assumptions, is used to derive an explicit
expression for the selection bias in terms of the participation probability, which
under monotonicity determines the amount of sample selection. This is then
accounted for directly in the outcome equation (e.g., Gronau 1973, Heckman 1979).
Early on it was recognized that absent experimental variation in
participation (e.g., an exogenous instrument affecting only participation and not the
outcome equation), identification is only based on functional form assumptions, and
results can be quite misleading if these assumptions are even slightly incorrect. By
contrast, a substantial literature has shown that once an instrument for
participation is available, treatment effects in the outcome equation can be
identified under quite general functional form and distributional assumptions (e.g.,
Newey, Powell, and Walker 1990). For example, Ahn and Powell (1993) show that
under assumptions of a single, strictly monotonic index for selection, variation in
the probability of participation independent from the variables in the outcome
equation suffices to control for selection. The difficulty is, of course, that such an
independent source of variation is often not available.
Card and Hyslop (2005) consider a special case in which an RCT does
generate exogenous variation in participation: an employment subsidy program.
They show that if the program only has positive effects on labor supply and does not
affect the wages for those who would have worked without it, then the experimental
effect on the hourly wage can be consistently estimated by the ratio of the treatment
effect on total earnings to the treatment effect on total hours worked.
Card and Hyslop’s assumptions are inappropriate for any program designed
to affect wages and not just participation. Below we discuss how the experimental
design itself may be modified to obtain exogenous variation in participation, even in
programs with effects on multiple margins.
Non- and semi-parametric selection corrections
Absent an instrument for participation, in the presence of selection the
treatment effect on mean wages is not identified. However, several studies have
exploited the fact that under certain assumptions quantile treatment effects (QTEs)
may be consistently estimated even in the presence of selection. A QTE for the q-th
quantile is defined as the difference between the q-th quantiles of the outcome
distributions in the treatment and control groups.33 It is not necessary to observe
each individual’s outcome to compute the q-th quantile; it suffices to know that
someone is above or below that quantile. Thus, if one can assume that all those who
are not employed have potential wages in the bottom q percent of the distribution,
one can estimate the treatment effect on the q-th quantile of potential wages by
merely assigning all non-workers the minimum observed value (e.g., Powell 1984,
Buchinsky 1994). Hence, under this assumption all quantiles above the
rate of nonemployment of the respective group can be identified. Because a QTE
requires the quantile to be identified in both groups, the higher of the treatment
and control groups' nonemployment rates determines which QTEs can be
identified.
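The imputation argument can be sketched in a few lines. The function below is an illustrative sketch, not the estimator of any cited study: it marks non-workers with `None`, fills them with the minimum observed wage, and reports a QTE only for quantiles above the larger of the two arms' nonemployment rates, since the quantile must be recoverable in both arms. The function names and the sample data are hypothetical.

```python
import math


def quantile(values, q):
    """Smallest value v such that at least a share q of `values` is <= v."""
    ordered = sorted(values)
    k = max(math.ceil(q * len(ordered)), 1)
    return ordered[k - 1]


def fill_nonworkers(arm, floor):
    """Replace unobserved wages (None) with the imputed floor value."""
    return [floor if w is None else w for w in arm]


def qte_with_imputation(treat_wages, control_wages, q):
    """QTE at quantile q, assuming all non-workers lie below that quantile.

    Returns None when q does not exceed the larger nonemployment rate,
    because the q-th quantile is then not identified in at least one arm.
    """
    observed = [w for w in treat_wages + control_wages if w is not None]
    floor = min(observed)
    nonemp_rates = [
        sum(w is None for w in arm) / len(arm)
        for arm in (treat_wages, control_wages)
    ]
    if q <= max(nonemp_rates):
        return None
    return (quantile(fill_nonworkers(treat_wages, floor), q)
            - quantile(fill_nonworkers(control_wages, floor), q))


# Hypothetical data: 10% nonemployment under treatment, 30% under control.
treat = [None, 12, 15, 18, 22, 25, 30, 35, 40, 50]
control = [None, None, None, 10, 14, 16, 20, 24, 30, 40]
median_qte = qte_with_imputation(treat, control, 0.5)
low_qte = qte_with_imputation(treat, control, 0.2)
print(median_qte, low_qte)
```

The value used to fill non-workers is immaterial for the identified quantiles; any value below the quantile of interest gives the same answer, which is why only the ranking assumption matters.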
A variant of this approach is to examine the simple treatment-control
difference in the probability of being observed in employment with a wage greater
than some relatively high threshold $w^*$. For many program evaluations,
understanding the impact on this outcome may be sufficient: it may not matter
greatly whether the impact derives from moving some people from non-employment
into high-wage employment or from simply lifting those who would
have worked anyway into higher-wage jobs. And even when the latter component is
the one of interest, this would be identified so long as those pulled into employment
by the treatment have wages that are uniformly below $w^*$.
33 For any random variable Y having cumulative distribution function $F(y) = \Pr[Y \le y]$, the q-th quantile of F is defined as the smallest value $y_q$ such that $F(y_q) = q$. If we consider two distributions $F_0$ and $F_1$, then $QTE(q) = y_q(1) - y_q(0)$, where $y_q(g)$ is the q-th quantile of distribution $F_g$.
It is not clear, however, that the required assumption holds: as pointed out
by Altonji and Blank (1999), among others, at any given time, some high-wage
individuals may be nonemployed. Moreover, this strategy is only useful insofar as
differences in quantiles of the outcome are deemed sufficient for evaluating the
effect of the program.
Another approach uses reservation wages to measure selection into the
subsample of observed wages. This works because, if correctly measured, the
reservation wage captures the lowest wage for which an individual is willing to
work. Hence, the reservation wage provides the censoring point for an individual’s
wage-offer distribution, allowing one to make inferences about potential wages for
those individuals not working in the treatment and control group. Johnson,
Kitamura, and Neal (2000) use the minimum of all observed wages for an individual
in longitudinal data to bound the reservation wage, under the assumption that it is
stable over time. Grogger (2005) uses directly reported reservation wage
information from a randomized evaluation of Florida’s Family Transition Program, a
welfare-to-work program with emphasis on work incentives and time limits. With
this information, he estimates the treatment effect of the program on wages using a
bivariate, censored regression model that allows for classical measurement error in
both observed wages and reservation wages. Once Grogger (2005) controls for
selection, he finds the program had statistically significant positive effects on
wages.
Addressing the selection problem using direct measures of reservation
wages makes intuitive use of the reservation wage concept. Moreover,
information on reservation wages is often already being collected in the context of
programs providing job search assistance, or, if not, it is at least in principle
relatively easy to elicit if the experimental design includes a survey component.
However, recent research suggests that in practice reported reservation wages
appear to only partly reflect the properties of the theoretical concept (e.g., Krueger
and Mueller 2016), casting some doubt on the robustness of this approach. In
particular, Krueger and Mueller report that a substantial number of workers accept
(reject) jobs offering wages below (above) their reservation wage, implying that
care should be taken in using reservation wages of the nonemployed to make
inferences about unobserved wage offers.
Yet another approach is to attempt to derive bounds for the treatment effect
under conditions more general than the monotonicity assumption inherent in the
Ahn and Powell (1993) and similar estimators. This allows researchers to
investigate how severe the bias from selection could possibly be and what can be
learned under general assumptions, rather than trying to obtain a point estimate
under more restrictive assumptions.
One bounding approach is proposed by Horowitz and Manski (2000). This
strategy asks how much the estimated treatment effect would be inflated if all
missing treatment observations were assumed to have the highest possible
outcomes and all missing control observations the lowest; then it asks how much it
would be depressed if the opposite assumptions were made. Unfortunately, these
bounds are typically not very tight, particularly when the outcome variable’s
support is potentially unbounded, as for example in the case of wages.
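The worst-case logic just described can be written out as a short sketch. It assumes the outcome has known logical bounds `y_min` and `y_max` (an assumption that fails for wages, which is exactly why these bounds are wide in that case) and marks missing outcomes with `None`. The function name and data are hypothetical.

```python
def worst_case_bounds(treat, control, y_min, y_max):
    """Bounds on the mean treatment effect when some outcomes are missing.

    Missing outcomes (None) are filled with the extreme values of the
    outcome's support, first in the direction that inflates the estimated
    effect and then in the direction that depresses it.
    """
    def filled_mean(arm, fill):
        return sum(fill if y is None else y for y in arm) / len(arm)

    lower = filled_mean(treat, y_min) - filled_mean(control, y_max)
    upper = filled_mean(treat, y_max) - filled_mean(control, y_min)
    return lower, upper


# Hypothetical binary outcome with one missing treated unit and two
# missing control units.
treat = [1, 1, 0, None]
control = [1, 0, None, None]
lower, upper = worst_case_bounds(treat, control, y_min=0, y_max=1)
print(lower, upper)
```

The width of the interval equals the sum of the two arms' missing-data shares times the range of the outcome, so it grows without limit as the assumed support widens.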
Lee (2009) proposes a strategy for obtaining tighter bounds, via stronger
assumptions: He assumes that anyone not employed in the control group would also
have been non-employed had they been in the treatment group, so that selection
bias arises solely from participants in the treatment group who are employed but
would not have been had they been assigned to be controls.34 He can then bound the
treatment effect by making extreme assumptions about this latter group. Denote by p
the excess fraction employed in the treatment group, measured as a share of
treatment-group employment. The upper (lower) bound is
constructed by removing the lowest (highest) fraction p of observations from the
treated subsample and recomputing the mean outcome for the treatment group,
effectively making the worst-case assumption that selection was fully responsible
for the entire upper or lower tail of values. Lee (2009) shows that the resulting
bounds are sharp and provides formulas for the standard errors. In the case of Job
Corps, the procedure results in informative bounds suggesting positive wage effects
from training, although a zero effect is contained in the confidence interval.
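A minimal sketch of the trimming procedure follows. It assumes treatment weakly raises employment, takes the trimming share to be the excess employment rate divided by the treatment-group employment rate, and trims whole observations only (a simplification). It omits the standard errors that Lee (2009) provides. The function name and data are hypothetical.

```python
import math
from fractions import Fraction


def lee_bounds(treat, control):
    """Trimming bounds on the wage effect; None marks a non-employed person.

    Assumes treatment weakly raises employment. The trimming share is
    p = (emp_T - emp_C) / emp_T, computed exactly with integer arithmetic.
    """
    t_obs = sorted(w for w in treat if w is not None)
    c_obs = [w for w in control if w is not None]

    # emp_T - emp_C has the sign of this cross-multiplied difference.
    excess = len(t_obs) * len(control) - len(c_obs) * len(treat)
    if excess < 0:
        raise ValueError("treatment lowers employment: reverse the arms' roles")

    p = Fraction(excess, len(t_obs) * len(control))
    n_trim = math.floor(p * len(t_obs))  # whole observations only
    keep = len(t_obs) - n_trim

    control_mean = sum(c_obs) / len(c_obs)
    lower = sum(t_obs[:keep]) / keep - control_mean    # drop the highest wages
    upper = sum(t_obs[n_trim:]) / keep - control_mean  # drop the lowest wages
    return lower, upper


# Hypothetical data: 8 of 10 treated and 6 of 10 controls are employed.
treat = [8, 9, 10, 12, 14, 16, 18, 20, None, None]
control = [10, 11, 12, 13, 14, 15, None, None, None, None]
lower, upper = lee_bounds(treat, control)
print(lower, upper)
```

In this example the untrimmed contrast among workers (0.875) lies inside the interval, and the interval includes zero, mirroring the kind of conclusion described for Job Corps without reproducing its numbers.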
Lee’s (2009) approach based on trimming requires relatively weak
assumptions. It presumes only that selection is monotonic in the treatment, that is,
that treatment either only increases, or only reduces, selection into employment.
Monotonicity is implied by standard empirical binary choice models typically used
to model participation choices (e.g., Vytlacil 2002), and hence bounds based on
trimming are applicable to a wide range of problems, including selective
employment, survey non-responses, or sample attrition.
34 The roles of the treatment and control groups are reversed if the treatment reduces employment.
If one is willing to impose further structure from theory, one may obtain
tighter bounds more specific to a particular problem. This is especially useful if the
theory has explicit predictions about how the endogenous outcome responds to
incentives.35 This is pursued by Kline and Tartari (2016), who analyze the
randomized evaluation of Connecticut’s Jobs First welfare-to-work program. While
previous analyses had found only small responses in hours (the intensive margin),
absent an instrument for participation (the extensive margin) sample selection
makes such estimates hard to interpret. Kline and Tartari (2016) use revealed
preference arguments in the context of a canonical but non-parametric static labor
supply model to describe which observed responses to the treatment at the
intensive and extensive margin are consistent with the theory. Given the nature of
the program studied, the result is a mapping of discrete counterfactual outcomes
(including non-participation as well as participation at different intensities) under
treatment and non-treatment, with restrictions on the allowable counterfactuals.
The question then is how likely certain transitions are, and in particular whether
changes at the intensive and extensive margin occur with positive probabilities.
Since Kline and Tartari can only observe the marginal distribution across states for
the treatment and control groups, they cannot point-identify the transition
probabilities. Instead, they construct bounds for transition probabilities among the
entire (discretized) distribution of states, including the probability of changes in the
intensive margin due to the treatment. Their approach also allows them to test the
restrictions from the model.
35 This may be more easily done for hours, which is typically assumed to be a choice variable, than for wages. Yet, to some degree the wage may be a choice variable as well, for example if jobs offer wage and effort combinations among which workers choose. This is the approach taken in some modern public finance, which often replaces hours worked with taxable earnings as the choice variable in analyses of intensive-margin labor supply.
This approach is useful, since it allows Kline and Tartari (2016) to learn
about intensive margin responses to the Jobs First program in the presence of
selection. Their results could also be used to think about the likelihood of intensive
margin responses for similar programs in similar populations. Alternatively, the
estimated bounds from the matrix of transition probabilities could be used, along
with the marginal distribution of labor supply under an existing program (AFDC, the
program of the control group), to construct bounds for the intensive and extensive
labor supply responses that could arise if Jobs First were implemented at another
site. A potential issue is that the procedure is complex and the analysis is specific to
the Jobs First program. Hence, while the general approach may be applicable to a
range of problems, this would require careful specification of the decision problem,
of the restrictions imposed by revealed preference theory, and of counterfactuals for
each case. Nevertheless, since many social experiments are concerned with welfare
and other programs that provide explicit variation in employment incentives and
hence useful information on the likelihood of counterfactual outcomes, it is useful to
consider the role that theory can play in providing bounds on treatment effects on
endogenous outcomes.36
36 Similar approaches have been pursued in Blundell, Bozio, and Laroque (2011).
ii. Addressing the issue ex ante through the design of the experiment
The endogenous outcome problem is often easily anticipated when designing
an experiment, as it arises whenever outcomes like wages or hours are of interest
and non-employment is a realistic possibility. There are various ways to adjust the
experimental design to facilitate analysis of potential sample selection bias. For
example, suppose in the case of the effect of a training program on wages the
researcher believes that there are exogenous factors determining a worker’s labor
supply decision. If these factors can be measured ex ante, the randomization could
be stratified by the likelihood of employment as predicted by the exogenous
instruments. Stratification would ensure sufficient sample sizes in each exogenous
labor supply tier. (If only available ex post, say, in a follow-up survey, even absent
stratification such variables can still be used as instruments for participation if
sample sizes are sufficiently large.)
However, as it is usually difficult to come by good instrumental variables, the
real power of a well-designed RCT would be to manipulate sample selection directly.
In the training example, this would entail adding a second source of randomization
that explicitly modifies the incentive to work (or the likelihood of finding a job) but
does not otherwise affect the endogenous outcome. Whether this is feasible depends
on the context. However, sample size considerations need not be a hurdle to adding
a second treatment, since with cross-classified treatments the addition of a second
treatment has little effect on the power for analyzing the effects of the first in
isolation. This approach is particularly useful if one is interested in external validity,
since the two-dimensional experimental variation may allow one to trace out the
treatment effect of training for sub-populations with different employment
probabilities.
In the case of non-random attrition, a version of this approach would be to
randomly select a group of participants to follow up more intensively, perhaps
stratified within groups with different ex-ante attrition probabilities. The contrast
between mean outcomes in this subgroup and for other participants (again, perhaps
within strata) identifies the selectivity of attrition, and can be used to adjust the
full-sample estimated treatment effects. This is the approach pursued in the follow-up
waves of the Moving To Opportunity experiment (e.g., Kling, Liebman, and Katz
2007). Another solution worth pursuing is to obtain administrative data for the
universe of initial participants, including those who have failed to respond to
follow-up surveys. Although these data can also be selected (they typically do not include
earnings from informal jobs), the selection is different from that created by survey
attrition, so the combination of sources can be valuable (though sometimes
confusing, as in the Job Corps evaluation discussed above). Since merges to
administrative data can usually be conducted only with identifying information
from the survey and permission from participants, it is a good idea to factor the
need for additional data into the initial research design.
c. Site and group effects
In many cases an essential problem is to identify the subpopulations that
benefit most from a program, so as to target them for treatment. However, there are
often many possible subgroups to examine. When many comparisons are estimated,
the chance of a false discovery (a treatment-control contrast that is statistically
significant, even though the true treatment effect is zero) rises toward one.
Avoiding incorrect inferences in such a setting requires care.
A version of the subgroup effects problem is to identify variation in
treatment effects across program locations or sites. Such variation might arise from
observed local characteristics; e.g., treatment effects of training or job search
experiments may depend on the tightness of the local labor market. Where the
relevant characteristics of the labor market are clear ex ante and their dimension is
limited, this is relatively straightforward. But if the relevant dimensions are not
clear or the number of potential contrasts is large, the multiple comparisons
problem becomes relevant. Alternatively, there might be unintended variation in
treatment intensity or in the fidelity or effectiveness of treatment delivery among
treatment sites. Such site effects render the interpretation of the estimated
treatment effect of the overall treatment difficult and limit external validity. If they
are potentially important, we need estimates of each site’s separate effect. This
implies that there are as many treatment effects to be estimated as there are sites at
which the experiment is implemented.
A conceptual issue in evaluating the success of social experiments with site
variation is to decide whether the parameter of interest is the effect of the program
in its most successful variants, with strong local partners and appropriate local
conditions, or the average effect across a range of local circumstances. When the
latter is of interest, the ideal experimental design would involve drawing
participants from all sites. But this is often impractical. More commonly, social
experiments have been carried out at one or a few sites. These are often chosen
because the local management is willing to participate, or because they are seen as
exemplars of the program. This makes it difficult to interpret the experimental
results as representative of the program as a whole (see, e.g., Hotz 1992 and Allcott
2015), but may come closer to identifying the program effect under close-to-ideal
circumstances.37
i. Addressing the issue ex post
On its face, it is straightforward to estimate heterogeneity of treatment
effects along observed dimensions (e.g., race, gender, or past work experience)
using data from an already-completed randomized trial: One simply constructs
treatment-control contrasts separately for each subgroup. Many authors emphasize
the importance of conducting the randomization separately for each subgroup of
interest. This is not in principle necessary (unconditional random assignment
ensures that assignment is random conditional on predetermined characteristics as
well) but can add power for subgroup comparisons, especially in smaller samples.
37 A related but distinct problem is the question of ensuring “fidelity of implementation” in an RCT: a close alignment between the program’s intended design and the services that are actually delivered. While this is important for maximizing the statistical power of the experiment and for testing whether the program’s theory of action is correct, it limits the external validity for use in making judgments about the likely overall impact of real-world programs, which may not be implemented with high fidelity.
A more important issue is the potential number of comparisons to be
estimated. If enough subgroup estimates are computed, even a program that has no
effect on anyone will be likely to show a statistically significant effect for some
subgroup. (A similar problem arises when considering effects on multiple
outcomes.) Researchers have taken a number of approaches to this multiple
comparisons problem. One is to specify the subgroups that will be considered, and
the hypotheses of interest, before analyzing the data. This can limit the scope for
unconscious data mining. It also ensures that the number of comparisons that were
considered is known, so that the p-values of simple treatment-control contrasts can
be adjusted for the multiplicity of the comparisons being estimated. An appropriate
adjustment makes it possible to obtain accurate p-values for the test of whether the
program had any effect on any subgroup. But two issues remain. First, these tests
typically have very low power. Second, even when they do reject, they are often
not able to identify which subgroups have non-zero treatment effects. A full
discussion of adjustment for multiple comparisons is beyond the scope of this
chapter, but Anderson (2008) is a useful reference.
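To make the multiplicity problem concrete, the sketch below computes the chance of at least one false rejection across independent subgroup tests when no subgroup has a true effect, and then applies Holm's step-down adjustment to a set of p-values. Holm's procedure is used here only as one standard family-wise adjustment; it is not necessarily the method used in the studies cited, and the independence of tests and the example p-values are assumptions for illustration.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false rejection among independent true-null tests."""
    return 1 - (1 - alpha) ** n_tests


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is scaled by (m - k + 1); taking the
        # running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted


# With 20 subgroups and no true effects, a "significant" contrast is likely.
fwer_20 = familywise_error_rate(20)

# Hypothetical unadjusted p-values for four subgroup contrasts.
adjusted = holm_adjust([0.01, 0.04, 0.03, 0.20])
print(round(fwer_20, 3), adjusted)
```

After adjustment only the first contrast remains below 0.05, which illustrates the power cost that the text notes for such procedures.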
Multiple comparisons approaches can be useful as well for the analysis of
treatment effects by site and/or provider. But the questions of interest regarding
site effects are not generally whether each site’s effect is or is not different from
zero, which is what multiple comparisons adjustments are designed to answer, but
rather the magnitude and correlates of variation in treatment effects across sites.
Moreover, the fact that the site-specific treatment effects can in some sense be seen
as draws from a larger distribution opens up new options for analysis that are not
available in traditional studies of subgroup treatment effects.
The mid-1990s National Job Corps Study, discussed above, illustrates some of
the issues involved.38 As mentioned previously, the random-assignment study
indicated that the program has a positive average effect on earnings four years after
participation, of a magnitude roughly comparable to the return to a full year of
education (Schochet, Burghardt, and McConnell 2008). (At the time of the
evaluation, the average participant was enrolled for about eight months.)
But like other job training programs, the specific “treatment” provided to Job
Corps participants varies substantially across individuals, according to perceived
needs. Moreover, Job Corps services are delivered at 110 mostly residential centers,
the majority of which are operated by private contractors. Some providers may be
better at delivering an effective program (or at guiding participants to the types of
services that they need) than are others. The center-specific treatment effects are
thus of great interest.
The Department of Labor (DOL) has long used a performance measurement
system to track performance of the different centers and inform decisions about
contract renewal. Performance measures are non-experimental, and include
statistics like the GED attainment rate or average full-time employment rate of
program participants at each center. But it is not clear that these performance
indicators successfully distinguish center impacts from differences in the
populations served by the various centers.
38 Other studies that examine similar questions are Bloom, Hill, and Riccio (2005) and Barnow (2000). See also our discussion of treatment spillovers above.
Schochet and Burghardt (2008; hereafter “SB”) attempt to use the
random-assignment Job Corps Study to validate DOL’s performance indicators (see also
Barnow, 2000, who carries out a similar exercise for JTPA). In principle, estimation
of site-level causal effects using the experiment is straightforward: One simply
compares mean outcomes of the treatment and control groups at each site, relying
on the overall random assignment to ensure balance of each site-level comparison.
But a few challenges arise.
First, in the Job Corps Study randomization took place before applicants were
assigned to centers. Thus, treated individuals are associated with centers, but
control individuals are not. SB address this by using intake counselors’ assessments
of the center that the applicant would most likely attend, collected prior to
randomization. To ensure that treatment and control individuals are treated
comparably, they use this prediction for both groups, even when it differs from the
actual treatment assignment. Differences occurred for only 7 percent of treatment
group enrollees, largely because participants tend to enroll in the closest center or
in one that offers a particular vocational program.
Second, even a large RCT sample (the Job Corps Study included over 15,000
participants) can have very small sample sizes at the individual site level. Rather
than estimate center-specific treatment effects, SB divide centers into three groups
based on their non-experimental performance measures and estimate mean
treatment effects for each group. Interestingly, they find that mean program impacts
do not differ significantly across groups, suggesting that the performance
measurement system is not successfully identifying variation in centers’ causal
impacts. A related exercise is carried out by Bloom, Hill, and Riccio (2005), who first
estimate statistically significant variation in treatment effects across 59 local offices
that participated in three welfare-to-work experiments, then use a multi-level
model to estimate the relationship between office characteristics (mostly having to
do with the way that the treatment was implemented in each site, though they also
include the local unemployment rate) and office-level treatment effects. In contrast
to the Job Corps study, they do find significant associations of the treatment effect
with both their implementation measures and the local unemployment rate.
Bloom, Hill, and Riccio’s (2005) interest is in identifying which program
features are most effective. It is important to emphasize, however, that the
association between site-level characteristics $X_j$ and the site-specific treatment
effect $\tau_j$ is observational, not experimental, and does not bear a strong causal
interpretation. It is quite possible that what appears, for example, to be a strong
association between the emphasis that sites place on quick job placement and the
site-level treatment effect instead reflects a non-random distribution of this
emphasis across sites that vary in other important ways.
Like the Job Corps study, Bloom et al. (2005) do not investigate variation in
site impacts conditional on $X_j$. In many settings, that variation might be of
substantial interest. One might like, for example, to estimate effects of individual
sites, or to ask which of a number of available performance measures do the best job
of predicting experimental impacts. The latter question is a natural one to ask
regarding the Job Corps Study, but to our knowledge it has not been pursued with
experimental data (though see Barnes et al. 2014 for a related investigation using
non-experimental data).
Much work on the estimation of site effects themselves comes out of efforts
to measure hospital, school, or teacher performance (see, e.g., Jackson, Rockoff, and
Staiger 2014 and Rothstein 2010). These studies are program evaluations, treating
each site or teacher as a distinct “program,” but cannot rely on random assignment
to identify program effects. As in the Job Corps Study, there are many sites but
samples are frequently small at the site level, so (even if selection biases are set
aside) site-specific treatment effect estimates are quite noisy. One consequence is
that actual treatment effects will typically be closer to the average than are
estimated effects, even when the research design permits unbiased estimation of
each effect. Thus, it is common in these literatures to “shrink” the estimated
treatment effects toward the mean. The procedure goes by many different names
(e.g., shrinkage, Empirical Bayes, regularization, partial pooling, multi-level modeling),
but the basic idea is that the posterior estimate of a site’s effect equals a weighted
average of the unbiased estimate of that site’s effect and the mean site effect, with
weights that depend on the precision of the site estimate.
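Let $\tau_j$ represent the impact of the program at site j, and suppose that across
sites, $\tau_j \sim N(\bar{\tau}, \omega^2)$. Suppose that we have a noisy but unbiased estimate of the site j
effect: $t_j \mid \tau_j \sim N(\tau_j, \sigma^2)$. Then the former can be treated as a prior distribution for $\tau_j$.
By Bayes’ Rule, the posterior mean of $\tau_j$ given the observed estimate is
\[ E[\tau_j \mid t_j] = \bar{\tau} + f\,(t_j - \bar{\tau}), \]
where
\[ f = \frac{\omega^2}{\omega^2 + \sigma^2} \]
is the reliability ratio of the site-specific effect estimate.39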
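The shrinkage formula can be illustrated with a short sketch. Two choices here go beyond the text and are assumptions: the sampling variance is allowed to differ by site (so that the weight depends on each site's precision), and the prior mean and variance are estimated from the data by simple moment matching rather than taken as known. The function name and inputs are hypothetical.

```python
def shrink_to_grand_mean(estimates, sampling_vars):
    """Posterior means of site effects under the normal-normal model.

    estimates      noisy, unbiased site-level estimates t_j
    sampling_vars  sampling variance of each t_j (may differ by site)

    The prior mean is estimated by the average of the t_j, and the prior
    variance omega^2 by the variance of the t_j net of average sampling
    noise (floored at zero).
    """
    n_sites = len(estimates)
    grand_mean = sum(estimates) / n_sites
    total_var = sum((t - grand_mean) ** 2 for t in estimates) / (n_sites - 1)
    omega2 = max(total_var - sum(sampling_vars) / n_sites, 0.0)

    posterior = []
    for t, s2 in zip(estimates, sampling_vars):
        reliability = omega2 / (omega2 + s2)  # the ratio f in the text
        posterior.append(grand_mean + reliability * (t - grand_mean))
    return posterior


# Hypothetical site estimates; the third site is measured least precisely.
site_estimates = [0.0, 2.0, 4.0]
site_vars = [1.0, 1.0, 3.0]
shrunk = shrink_to_grand_mean(site_estimates, site_vars)
print(shrunk)
```

The noisiest site is pulled furthest toward the mean, which is the sense in which the weights depend on the precision of each site's estimate.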
When the treatment effect varies systematically with site-level covariates
(characteristics either of the treatment or of the counterfactual), this can be used to
improve precision. If the site effects are modeled as a function of site characteristics,
$\tau_j = X_j \beta + \nu_j$, with $\nu_j \sim N(0, \sigma_\nu^2)$, then the noisy site-level estimate $t_j$ should be
shrunken toward the conditional mean rather than to the grand mean:
\[ E[\tau_j \mid t_j, X_j] = X_j \beta + f'\,(t_j - X_j \beta), \]
where $f'$ is the conditional reliability ratio, $f' = \sigma_\nu^2 / (\sigma_\nu^2 + \sigma^2)$. This is sometimes
known in the statistics literature as “partial pooling.”
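A sketch of shrinkage toward a conditional mean follows, for the simplest case of a single site-level covariate plus an intercept. It takes the conditional reliability ratio to be $\sigma_\nu^2 / (\sigma_\nu^2 + \sigma^2)$, the form implied by the normal model above, and estimates $\beta$ by ordinary least squares and $\sigma_\nu^2$ by the residual variance net of average sampling noise. These estimation choices, the function name, and the data are assumptions made for illustration.

```python
def shrink_to_conditional_mean(estimates, covariate, sampling_vars):
    """Shrink site estimates toward a fitted line in one site covariate.

    Fits t_j = a + b * x_j by least squares, estimates the variance of the
    unexplained site effects as the residual variance net of average
    sampling noise (floored at zero), and returns the posterior means.
    """
    n_sites = len(estimates)
    x_bar = sum(covariate) / n_sites
    t_bar = sum(estimates) / n_sites
    sxx = sum((x - x_bar) ** 2 for x in covariate)
    sxt = sum((x - x_bar) * (t - t_bar) for x, t in zip(covariate, estimates))
    slope = sxt / sxx
    fitted = [t_bar + slope * (x - x_bar) for x in covariate]

    resid_var = sum((t - m) ** 2 for t, m in zip(estimates, fitted)) / (n_sites - 2)
    sigma_nu2 = max(resid_var - sum(sampling_vars) / n_sites, 0.0)

    posterior = []
    for t, m, s2 in zip(estimates, fitted, sampling_vars):
        reliability = sigma_nu2 / (sigma_nu2 + s2)  # conditional ratio f'
        posterior.append(m + reliability * (t - m))
    return posterior


# Hypothetical data: four sites, one covariate, equal sampling variances.
x = [0.0, 1.0, 2.0, 3.0]
t = [0.0, 3.0, 1.0, 4.0]
post = shrink_to_conditional_mean(t, x, [0.5, 0.5, 0.5, 0.5])
print(post)
```

Each posterior mean lies between the site's own estimate and the value predicted from its covariate, so sites are pooled only with others that look similar on the covariate.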
One use of the shrinkage approach is by Kane and Staiger (2008), who use a
random-assignment experiment to validate non-experimental estimates of teachers’
treatment effects on their students. They shrink the non-experimental estimates,
under the assumption that these estimates are valid, and ask whether the result is
an unbiased predictor of a teacher’s treatment effects under random assignment.
Kane and Staiger focus on “value-added” scores, estimates of teachers’ effects
on their students’ test scores from observational regressions, as the sole
non-experimental estimate. They fail to reject the hypothesis that these scores are
unbiased predictors of the experimental effects, consistent with the view that they
are unconfounded by student sorting. But the experiment has quite low power to
distinguish alternative explanations, and Rothstein (2016) argues that the question
remains unresolved.40
39 The posterior mean is also known as an Empirical Bayes estimate. It is an unbiased predictor of the true site-level treatment effect $\tau_j$ if the site-specific estimates $t_j$ are unbiased estimates (Rothstein 2016).
Angrist et al. (2015) explore the optimal combination of experimental
estimates with potentially biased but more precise non-experimental estimates to
obtain minimum mean-squared-error predictions of schools’ treatment effects. A
related question is whether non-experimental measures of other parameters (e.g.,
classroom observations) can improve the prediction of experimental effects. If so,
one might want to use a weighted average of the available measures, weighted to
best predict the experimental treatment effect, for performance measurement
purposes. To our knowledge, no study has attempted to estimate these weights in an
experimental setting (though see Mihaly et al., 2013, for a non-experimental
analysis).
ii. Addressing the issue ex ante through the design of the experiment
Ultimately, small sample sizes have limited analysts’ ability to identify site-
or group-level variation in treatment effects. But there may be ways to design
experiments to better support these investigations. Most obviously, resources can
be put into collecting data on variation in the quantity and types of treatments
delivered, to support analyses (like that of Schochet and Burghardt 2008 or Bloom
et al. 2005) of how site treatment effects vary with observable measures of site
treatment variation. Large-scale program evaluations often include implementation
analyses alongside randomized impact evaluations, and if these two portions were
closely integrated the results of the implementation study could be used to inform
an analysis of site effects in the impact evaluation sample. Power can also be
improved by conducting randomization within site-level strata and by minimizing
non-compliance rates (and carefully measuring treatments actually received).
40 For more on the topic of teacher value-added, see Chetty, Friedman, and Rockoff (2014) and Rothstein (2016).
d. Treatment effect heterogeneity and external validity
The empirical literature on program evaluation has been increasingly aware of the importance of potential heterogeneity in treatment effects for interpreting estimates of program impacts and assessing their external validity. Many evaluation samples are drawn from specific populations – individuals in particular regions or cities, individuals entering a program in a certain way, or individuals thought suitable for a proposed alternative program. If treatment effects vary, generalizing from these samples to a broader population is hazardous. Another variant of the external validity problem arises when the compliance rate in the experimental sample differs from what would be expected outside the experiment, as the experimental LATE may not correspond to an appropriate complier population for the program evaluation of interest.
There are several potential sources of heterogeneity. In the previous section, we discussed differences in characteristics of the environment (such as the state of the labor market, including business cycle and industry or occupation structure, population density, or labor market discrimination) and differences in aspects of the program (such as unintended differences in the intensity of treatment, something we address under site effects). In this section, we focus on the case where treatment effects vary because of differences in characteristics at the individual level (such as preferences, abilities, health, beliefs, resources, family environment, or access to networks). Below and in Section IV.f, we also discuss variation in treatment effects arising because of variation in structural aspects of the program, such as differences in work incentives.
i. Addressing the issue ex post
The literature is broadly in agreement on how to deal with heterogeneity in treatment effects by observable characteristics of study participants. As discussed in Section IV.c, the experimental design implies that one can obtain consistent estimates of the treatment impact for each subgroup, subject to having sufficiently large sample sizes. One can then extrapolate the TOT and ATE to settings with other distributions of observable characteristics by constructing appropriately weighted averages of subgroup effects and corresponding standard errors. As a more common alternative, one can directly estimate TOT and ATE for another population by reweighting the original sample to match the distribution of observable characteristics of the target population (e.g., DiNardo, Fortin, and Lemieux 1996). If multiple treatment sites are available, in principle a similar approach can be used to assess the effect of environmental characteristics, such as labor market conditions or industrial structure.
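
The reweighting idea can be illustrated with a minimal sketch. It assumes a single discrete observable (a hypothetical "low"/"high" education cell) and reweights each arm so that its cell shares match those of a target population; this is a simplified discrete-cell stand-in for, not an implementation of, the DiNardo, Fortin, and Lemieux procedure. All names and data below are hypothetical.

```python
from collections import Counter


def cell_weights(sample_cells, target_shares):
    """Weight each unit by (target share of its cell) / (sample share of its cell)."""
    n = len(sample_cells)
    sample_shares = {c: k / n for c, k in Counter(sample_cells).items()}
    return [target_shares[c] / sample_shares[c] for c in sample_cells]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def reweighted_effect(outcomes, treated, cells, target_shares):
    """Treatment-control difference in means after reweighting each arm
    to the target distribution of the observable cell."""
    means = {}
    for arm in (1, 0):
        idx = [i for i, t in enumerate(treated) if t == arm]
        w = cell_weights([cells[i] for i in idx], target_shares)
        means[arm] = weighted_mean([outcomes[i] for i in idx], w)
    return means[1] - means[0]


# Hypothetical data: the effect is 2 in the "low" cell and 6 in the "high" cell.
cells = ["low"] * 6 + ["high"] * 2
treated = [1, 1, 1, 0, 0, 0, 1, 0]
outcomes = [12, 12, 12, 10, 10, 10, 26, 20]

# Experimental sample is 75% "low"; the hypothetical target population is 50/50.
in_sample = reweighted_effect(outcomes, treated, cells, {"low": 0.75, "high": 0.25})
target = reweighted_effect(outcomes, treated, cells, {"low": 0.5, "high": 0.5})
print(in_sample, target)  # approximately 3.0 and 4.0
```

The extrapolated effect moves from 3 to 4 only because the target population places more weight on the high-effect cell; heterogeneity along unobserved dimensions is not addressed by this adjustment.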
The case of heterogeneity by unobserved characteristics has presented greater challenges. Unfortunately, the individual-level treatment effect is generally not identified by either experimental or non-experimental methods. Even with perfect compliance, an experiment identifies only the average treatment effect conditional on observed characteristics.
Some argue that average treatment effects are sufficient for most purposes, as we care only about the distributions of outcomes under alternative policies and not about the positions of particular individuals within those distributions. This is a controversial claim, however – in many contexts, a program that helped some individuals but hurt others by an equal amount, with zero average effect, would be judged worse than nothing.
Moreover, average effects may not be generalizable beyond the population identified by an experiment (the experimental participants under perfect compliance, or the subgroup of compliers under imperfect compliance). With heterogeneous treatment effects, neither the TOT nor the complier LATE may be relevant for other populations of interest. A key question then is how representative the experimental compliers are of the group of people that would be potentially affected by the program in question. In many cases the program compliers are likely to be similar to the population of interest, in which case the complier LATE is likely to approximate the relevant parameter. In other cases – for example when compliance is likely to differ between the study and the program at scale – the estimated LATE from one program evaluation may be less useful.
Heckman and Vytlacil (2005) propose a conceptual framework to analyze heterogeneity in treatment effects that relies on the concept of the marginal treatment effect (MTE). If $\tau_i$ denotes the individual treatment effect, $X_i$ is a vector of observed individual characteristics, and $v_i$ is the error in the equation determining takeup of treatment, then the marginal treatment effect is defined as $E[\tau_i \mid X_i = x, v_i = v]$; of interest is how this varies with $v$. This structure provides a framework for considering external validity. The traditional LATE obtained from analyses of experiments with noncompliance can be seen as the integral of the MTE over a particular range of $v$, but proposals to expand or roll back programs may implicate MTEs at other values of $v$.
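
To make the link between the LATE and the MTE concrete, the relationship can be written out under a common normalization that is assumed here rather than stated in the text: $v$ is scaled to be uniform on $[0,1]$ conditional on $X$, individuals take up treatment when $v$ falls below the takeup probability, and a binary instrument moves that probability from $p_0$ to $p_1 > p_0$ (the symbols $p_0$ and $p_1$ are introduced only for this illustration).

```latex
\mathrm{MTE}(x, v) = E[\tau_i \mid X_i = x,\ v_i = v],
\qquad
\mathrm{LATE}(x) = \frac{1}{p_1 - p_0} \int_{p_0}^{p_1} \mathrm{MTE}(x, v)\, dv,
\qquad
\mathrm{ATE}(x) = \int_{0}^{1} \mathrm{MTE}(x, v)\, dv .
```

Under these assumptions the LATE averages the MTE only over the interval $[p_0, p_1]$ of individuals shifted by the instrument, whereas the ATE requires the MTE over the whole unit interval. A policy that changes takeup outside $[p_0, p_1]$ therefore depends on portions of the MTE curve about which a single binary instrument is silent.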
To move beyond the LATE, we require a multi-valued instrument that can map out the full distribution of $v$ (or, equivalently, the full range of $\Pr(T = 1 \mid X)$). If such an instrument is available, the MTE can be obtained by a non-parametric regression of the outcome on the fitted probability of program participation resulting from the first stage equation.41
This is not possible in the case of a simple RCT. However, when the RCT is implemented at multiple sites, and if one is willing to assume that heterogeneity of site effects is limited to compliance rates with no variation in effects on the outcome, one can examine the relationship between the site-specific compliance rate and the site-specific estimated treatment effect (i.e., the site-specific LATE).42 (Alternatively, one could directly regress the site-specific treatment effect on the estimated probability of takeup and obtain the MTE for different compliance rates.) This relationship could in principle be used to forecast the local average treatment effect at a potential alternative treatment site (possibly reweighting to adjust for differences in observable characteristics), given a forecast of the new site’s compliance rate. More generally, this approach would allow inferring the effect of any intervention affecting the cost of compliance and hence the compliance rate itself.

41 Many other relevant parameters, including LATE and ATE, can be expressed as functions of the MTE. However, to estimate the ATE or the TOT, say, one needs to obtain the MTE for each value of X for the full range of complier probabilities, i.e., from 0 to 1. While in many cases this may be infeasible due to data limitations, if available this could be used to extrapolate the ATE or TOT for populations with different compliance rates and distribution of characteristics.

42 Note that the weighting function of the LATE estimator for multi-valued instruments in Angrist and Imbens (1995) is proportional to the differences in takeup probabilities between different values of the instrument (ordered by the values’ impact on takeup). This difference can be interpreted as the difference in compliance between instrument values.
At times it is useful to go further and estimate the full distribution of treatment effects. The above method will not accomplish this. Heckman, Smith, and Clements (1997) show that without additional assumptions, experimental data is essentially uninformative about the treatment effects distribution. Moreover, they demonstrate that quite strong assumptions on the dependence of counterfactual outcomes in the control and treatment states are needed to obtain plausible estimates of the distribution of the effect of training in the context of the National Job Training Partnership Act (JTPA) study. Nevertheless, as mentioned at the outset, knowledge of the distribution of heterogeneous treatment effects is undoubtedly important in assessing the impact of a particular program (though it is less straightforward how such information can be used to address the issue of external validity if treatment effects vary purely with unobserved characteristics).
One approach that has been used to make inferences about heterogeneity in treatment effects is estimation of quantile treatment effects (QTE). As discussed in Section IV.a, the QTE for the q-th quantile is defined as the difference between the q-th quantiles of the outcome distributions in the treatment and control groups. It is clear that absent strong assumptions, such as rank stability, QTEs do not recover the distribution of treatment effects (though they do recover the effect of the treatment on the outcome distribution, which may be sufficient for many purposes; see Athey and Imbens, this volume). Yet a QTE analysis can be a helpful and easy-to-implement diagnostic device in at least two senses. First, it can be used to test the assumption of constant treatment effects, which would imply that the QTE is equal at all quantiles. Second, in some cases particular features of a program allow one to derive predictions as to responses in different quantiles of the outcome distribution (see below). More generally, QTE may provide a broad descriptive sense of potential treatment responses.
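
As a minimal sketch of the first diagnostic use, the QTE can be computed as a difference of empirical quantiles and inspected across quantiles. The data, the function names, and the linear-interpolation quantile rule below are illustrative assumptions, and no standard errors or formal test are included.

```python
def quantile(values, q):
    """Empirical q-th quantile, interpolating linearly between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def qte(treated, control, q):
    """Difference between the q-th quantiles of the treated and control outcomes."""
    return quantile(treated, q) - quantile(control, q)


# Hypothetical outcome data.
control = [0, 10, 20, 30, 40]
constant_shift = [y + 5 for y in control]  # every quantile moves by 5
fanned_out = [0, 12, 25, 40, 60]           # larger shifts at higher quantiles

for q in (0.25, 0.5, 0.75):
    print(q, qte(constant_shift, control, q), qte(fanned_out, control, q))
# 0.25 5.0 2.0
# 0.5 5.0 5.0
# 0.75 5.0 10.0
```

A flat QTE profile is what constant treatment effects would imply; a profile that varies across quantiles, as in the second hypothetical sample, is evidence against constant effects but does not by itself reveal which individuals gained or lost.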
One source of treatment effect heterogeneity is differences in the structure of the program to be evaluated. In this case, theory may provide weak assumptions that allow making inference on the distribution of treatment effects. Welfare programs represent a good example, since they usually combine a range of different labor supply incentives arising among others from welfare payments, earnings disregards, implicit tax rates, or phase-out regions. Clearly, these incentives interact locally with individual heterogeneity in preferences or ability, something we will return to below. But the additional structure can make for more natural identifying restrictions than in the case of a program that is at least intended to be uniform, such as a training course. A series of papers has addressed this question in the context of evaluation of Connecticut’s welfare-to-work program, Jobs First, against the then-prevailing alternative welfare program. For example, to assess the degree of heterogeneity in treatment responses Bitler, Gelbach, and Hoynes (2006) implement a QTE analysis as described above, and relate the resulting estimates to predictions from a standard labor supply model. The Kline and Tartari (2016) study discussed above, aimed at bounding transition probabilities between counterfactual states, takes advantage of across-participant observable differences in the nature of the decision problem faced to construct revealed-preference restrictions on the set of potential transitions. This is an important diagnostic device for assessing the range of counterfactual treatment responses to the program itself. As discussed above, a potential drawback is that the procedure is rather complex and only applies to the particular program studied. One also has to contend with possibly wide bounds.
In principle, Kline and Tartari’s approach can also be used for predicting the effect on the distribution of marginal outcomes of moving from traditional welfare to a welfare-to-work program of the same structure at another site (see Section III.a). Yet, it is worth keeping in mind that the estimated bounds have the LATE property, i.e., they may depend on the particular distribution of individual characteristics and the local environment. Extrapolating to different populations or environments in their context would require imposing additional assumptions on the underlying static labor supply model, and thus trading off additional predictions against robustness.
ii. Addressing the issue ex ante through the design of the experiment
There may be an opportunity to make more progress on this type of treatment effect heterogeneity by building it into the randomization design. Cross-classified and multiple treatment group experiments can be quite helpful for identifying variation in treatment effects.
In some cases, we are directly interested in understanding the distribution of treatment effects. When a plausible structural model (perhaps something as simple as a Heckman-Vytlacil (2005) Roy model) is available, one might use the structural model to predict individual treatment effects, then stratify the experiment based on these predictions. The NIT studies can be seen as a version of this, as these were stratified based on prior earnings, a potentially strong predictor of the treatment effect.
In other cases, concerns about heterogeneity are driven by potential differences between the complier LATE and the population ATE. Rather than simply assigning participants to be offered or not offered the treatment, one might also vary the extent of efforts to enforce compliance with the experimental assignment. When the relevant selection is thought to be based in part on the anticipated individual treatment effect, as in Heckman and Vytlacil (2005), one can identify the MTE curve directly by randomly assigning participants to multiple values of the incentive (or cost) to obtain the treatment.
Which of these is appropriate depends on the nature of the selection into compliance in the experiment, and how it relates to what would be observed in a non-experimental setting. To make things concrete, we will consider a study in which applicants are randomly assigned to be eligible or ineligible to receive training offered at a particular job-training center. One might expect that compliance rates will be low for those assigned to the treatment group for whom it is inconvenient to travel to the program site. One might then expect the LATE to vary with travel costs, but in a simple experiment there is no way to estimate how much of this is due to differences in average treatment effects between those who live close to and far away from the program site and how much to differences in selection into the complier group.
One way to learn about this would be to implement a more complex, multiple treatment arm experiment in which a subset of individuals offered access to the training are also offered transportation to the training site. If the distance-treatment effect curves differ between the two treatment arms, one can conclude that selection into participation is important, and this can then be used (with a parametric selection model) to estimate how the LATE for a similarly-selected complier population varies with distance. This may be important if the goal is to generalize from the experiment to a scaled-up program that would offer training at a larger number of sites.
One can also use the three-arm experiment to identify the MTE curve, but only with strong restrictions on the shape of this curve (which correspond to strong parametric assumptions about the selection process; see Brinch, Mogstad, and Wiswall forthcoming). These restrictions may be unattractive. If an important goal of the study is to understand how treatment effects vary with the costs of participation, an even more complex experimental design might be called for. Rather than assigning individuals to a treatment group that receives training at zero cost or a control group that is denied access to training at any price, one might use multiple groups that are offered training at different price points (including potentially negative prices). Variation in outcomes across these groups will trace out several points on the MTE curve and can be used to identify a more flexibly shaped curve under weaker assumptions.
Cross-classified and multiple treatment arm experiments raise a number of practical issues that are not confronted in classical treatment/control studies. First, allocating observations across many arms reduces power to detect differences in outcomes between any pair of treatments. Researchers designing experiments must therefore trade off the benefits of a multiple-treatment-arm experiment against reduced ability to detect particular pairwise contrasts. This issue can sometimes be addressed, however, when the alternative arms can be seen as varying the dosage of a single well-defined treatment. An experiment where all treated individuals are assigned a treatment dose of 1 gives less power for identifying a linear dose-response relationship than one where the same individuals are assigned varying doses with a mean of 1 (for example, when half are assigned a dose of 0.5 and half are assigned 1.5); moreover, the latter design provides at least the chance of detecting nonlinear effects.
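
The power comparison can be checked with a small calculation. The sketch below assumes a linear dose-response model with homoskedastic errors, in which the sampling variance of the OLS slope is the error variance divided by the sum of squared deviations of dose from its mean; it also assumes a control group of equal size at dose 0. The sample sizes are hypothetical.

```python
def slope_variance(doses, sigma2=1.0):
    """Variance of the OLS slope in y = a + b*dose + e with Var(e) = sigma2:
    sigma2 / sum((dose - mean_dose)^2)."""
    mean_dose = sum(doses) / len(doses)
    sxx = sum((d - mean_dose) ** 2 for d in doses)
    return sigma2 / sxx


# Hypothetical design with 200 controls (dose 0) and 200 treated individuals.
single_dose = [0.0] * 200 + [1.0] * 200                # all treated receive dose 1
varied_dose = [0.0] * 200 + [0.5] * 100 + [1.5] * 100  # treated doses average 1

print(slope_variance(single_dose))  # 0.01
print(slope_variance(varied_dose))  # about 0.0067
```

Spreading the treated group's doses around the same mean raises the variance of the regressor and so lowers the variance of the estimated slope, here by one third; the gain would differ under other allocations or if the dose-response relationship were not linear.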
Cross-classified experiments, with a fraction $p$ assigned to treatment A and a fraction $q$ independently assigned to treatment B, can also be seen as sacrificing power, though again the reality is more complex. Let $y_{abi}$ represent the potential outcome for individual $i$ when the program A assignment is $a$ ($a = 0$ or $1$) and the program B assignment is $b$. The traditional estimand for evaluation of program A is $E[y_{10i} - y_{00i}]$. Only $(1-q)N$ of the $N$ observations in the cross-classified experiment can be used for estimating this quantity, as the other $qN$ observations are assigned to receive treatment B. But the experiment has full power for estimating the alternative treatment effect $E[((1-q)\,y_{10i} + q\,y_{11i}) - ((1-q)\,y_{00i} + q\,y_{01i})]$. This can be seen as a weighted average of two treatment effects of program A, one that applies to individuals who also receive program B and one for those who do not. In some cases, this may be of more interest than the traditional estimand – e.g., when the scaled-up version of program A will coexist with program B.
e. Hidden treatments
A long-standing issue in the interpretation of job training program evaluations is that these evaluations commonly have substantial rates of non-compliance and crossovers. Many people assigned to receive training do not complete their courses, and it has been operationally and politically difficult to exclude people assigned to the control group from receiving treatment, either from the same provider that serves the treatment group or from an alternative provider. Indeed, in some cases, ethical concerns led to decisions to actively inform control group individuals about alternative sources of training.
Much of the literature treats this as non-compliance of the type discussed in Section II.b.ii, and so estimates the training effect by dividing the ITT effect by an estimate of the complier share (see, e.g., Heckman, Hohmann, Smith, and Khoo, 2000). But this is unsatisfactory when the control group non-compliers receive a different treatment – e.g., training from a different provider – from that given to the treatment group. In technical terms, this is a violation of SUTVA; practically, it means that assignment to treatment may affect outcomes even for the always-takers who receive (some type of) training in any case. To our knowledge, this issue has not been addressed in the enormous literature on job training experiments. (Heckman et al., 2000, note the issue, but their analyses focus on non-random selection into training and heterogeneity of training effects, which are related but distinct issues.)
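
For reference, the conventional calculation that this paragraph criticizes can be sketched as follows. The data and names are hypothetical, and the sketch deliberately treats all training as a single homogeneous treatment, which is exactly the assumption that fails when control group crossovers receive a different service.

```python
def mean(xs):
    return sum(xs) / len(xs)


def wald_estimate(outcomes, assigned, received):
    """Return (ITT, complier share, ITT / complier share).

    assigned: 1 if randomized to the treatment group, else 0.
    received: 1 if the person received any training, else 0.
    """
    y1 = mean([y for y, z in zip(outcomes, assigned) if z == 1])
    y0 = mean([y for y, z in zip(outcomes, assigned) if z == 0])
    d1 = mean([d for d, z in zip(received, assigned) if z == 1])
    d0 = mean([d for d, z in zip(received, assigned) if z == 0])
    itt = y1 - y0
    complier_share = d1 - d0
    return itt, complier_share, itt / complier_share


# Hypothetical sample: 3 of 4 treatment group members train; 1 of 4 controls crosses over.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
received = [1, 1, 1, 0, 1, 0, 0, 0]
outcomes = [14, 14, 12, 10, 14, 10, 10, 10]

itt, share, late = wald_estimate(outcomes, assigned, received)
print(itt, share, late)  # 1.5 0.5 3.0
```

The rescaling from 1.5 to 3.0 is valid only if the crossover's training is the same treatment as that offered to the treatment group; if it is a distinct service, the ratio no longer has a clean interpretation.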
Even the IV approach, unsatisfactory as it is, is often not feasible: It requires measuring the share of the control group that crosses over. In many cases, this is not available: The experimental data includes information on the receipt of services from the program under study but not on services obtained from other sources. In this case, only intention-to-treat (ITT) estimates can be computed. But these are attenuated by the failure to measure the “hidden” alternative treatments.
i. Addressing the issue ex post
A very recent literature takes up this topic in the context of the Head Start pre-school program. The Head Start Impact Study randomly assigned Head Start applicants to be offered care or turned away. Many of the control group applicants (and a smaller share of the treatment group) wound up receiving alternative center-based childcare that is thought to be less effective but may be a partial substitute. Where traditional IV estimators treat this as equivalent either to the Head Start treatment or to the receipt of no services, it might be more appropriate to treat it as a distinct, “hidden” treatment.
Walters (2014) estimates heterogeneity in the Head Start effect across centers (sites), finding (among other results) that the LATE of Head Start participation is smaller when more of the complier group is drawn from other centers rather than home-based care. This is suggestive that other center-based care is distinct from home-based care.
Kline and Walters (2014) explicitly model the hidden alternative center treatment, using variation in the compliance patterns across participants’ observable characteristics (e.g., parental education) to identify a multinomial variant of a Heckman (1979) parametric selection correction and thus obtain partially experimental estimates of the separate effects of the two types of child care. Their approach leverages variation across observable characteristics (X) in the share of experimental compliers who are drawn from alternative center care, together with a utility-maximizing choice model that constrains how selection on unobservables varies with X. With the restrictions imposed by this model, they find large effects of Head Start relative to home-based care. As the Head Start experiment did not directly manipulate the choice between home-based and other center care, they are not able to estimate the relative effect of these with any precision in their least restrictive model, though point estimates are consistent with an effect of other centers comparable to that of Head Start. When Kline and Walters impose stronger restrictions on the selection process, they obtain similar point estimates but with more precision.
Feller et al. (2014) also examine the hidden treatments issue in the Head Start Impact Study sample. They use a principal post-stratification approach that, like Kline and Walters, exploits variation across observables in selection into the two treatments. They couple this to a finite mixture modeling strategy that treats the separation of the two complier subgroup distributions as a deconvolution exercise. Parametric assumptions about these distributions are used to identify the local average treatment effects of the two treatments. Results are similar to Kline and Walters: Head Start has positive effects on those who would otherwise be at home, but little effect on those who would otherwise receive alternative center-based care.
Another example of the analysis of hidden treatments is Pinto’s (2015) analysis of the Moving to Opportunity experiment. In one view, the MTO study involved two treatment arms: One offered a housing voucher that could be used anywhere, and the other restricted the voucher to a low-poverty neighborhood. Straightforward experimental comparisons identify the ITT and LATE of usage of each type of voucher. In another view, however, the relevant treatment is the type of neighborhood in which the participant lives. Kling, Liebman, and Katz (2007) use variation across the two treatment arms and across sites to identify effects of neighborhood poverty (under restrictions on treatment effect heterogeneity). Pinto (2015) adds more structure, using revealed preference restrictions – anyone offered an unrestricted voucher who moves to a low-poverty neighborhood can be assumed to choose the same type of neighborhood in the counterfactual where she receives a restricted voucher – to identify parameters of interest concerning the distribution of neighborhood-type treatment effects.43
ii. Addressing the issue ex ante through the design of the experiment
The Pinto (2015) study takes advantage of the multiple-treatment arms in the MTO experiment, while the Head Start papers discussed above exploit, in various ways, the use of centers as strata in that experiment. This suggests, correctly, that complex experimental designs may be useful in resolving hidden treatment problems, and that a researcher interested in these problems might be able to design an experiment with them in mind. In the neighborhood effects example, one might want to have several treatment arms that vary in the restrictions they place on neighborhood choice; for Head Start, one might explore a third treatment arm that provides a voucher usable either at a Head Start center or at an alternative center. This design might also be useful for a job training evaluation.

43 Pinto’s analysis assumes that the set of neighborhoods in which a voucher can be used is the only relevant difference between the two treatment arms. But in MTO low-poverty voucher recipients were also offered counseling that may have had independent impacts on neighborhood choice or even on outcomes.
In each of these cases, it is crucial to collect information about the type and amount of treatment that each participant actually receives; without this, the complex experimental designs are of little value.
f. Mechanisms and multiple treatments
The history in Section III makes clear that many labor market experiments involve variation in more than one aspect of a given program. This is clearly the case when programs consisting of suites of services and incentives are evaluated, such as in randomized evaluations of welfare-to-work programs or of large-scale training programs with a range of integrated services such as JTPA or Job Corps. Yet, even the interpretation of many RCTs of smaller training programs is made difficult by the fact that some form of job search assistance is provided. Simple RCTs do not identify which of the components of the treatment are responsible for the impact. Learning about such mechanisms, besides being of interest in its own right, is particularly desirable if one wishes to extrapolate to new programs or learn about underlying behavioral parameters. This is recognized, for example, in the ongoing evaluation of the REA program discussed in Section III, which aims explicitly at distinguishing the effect of a ‘hassle’ due to being summoned to appear from the actual job search assistance provided.
Even when the treatment has only one component, in many cases that component is sufficiently complex that the average treatment effect is not enough – we want to understand the underlying mechanism. The simplest example of this is labor supply experiments, for which it is often important to distinguish income and substitution effects. It also arises in many of the welfare reform programs, which can create complex changes in intertemporal budget constraints due to time limits or eligibility effects.
i. Addressing the issue ex post
Researchers have used a number of strategies to extract from experimental data evidence on the mechanisms underlying the treatment effects identified by the experiment. In the simplest case, it is sometimes possible to use experimental variation to distinguish the relevant mechanisms, with only minimal restrictions derived from theory. This is most feasible when the experiment involves more than two groups. The first large-scale social experiments, the Negative Income Tax studies, were used in this way. The “treatment” here was a tax schedule described by two parameters: The transfer received if earnings were zero and the tax rate applied to any earnings. The main outcome was labor supply, and a key concern of these studies was to distinguish income from substitution effects.
With a single treatment arm and a single control group, this would not be possible: The net effect of the treatment would be identified, but there would be no way of distinguishing substitution from income effects. (One exception would be if the treatment were designed to be a fully compensated change in the marginal tax rate – this would have no income effect, so the treatment effect would equal the substitution effect. But the NIT treatments were not designed this way.) With multiple treatments that vary both the base transfer and the marginal tax rate, and with an assumption that both income and substitution effects are linear in the relevant tax variable, the two effects can be estimated separately.
To see this, suppose a labor supply function that relates hours of work ($H$) to the wage rate ($w$), non-labor income ($N$), the marginal tax rate ($r$), and other factors such as preferences for leisure ($e$):

$H = f(w, N, r, e)$.
For simplicity of exposition, we assume a constant marginal tax rate, though this is not crucial (see Hausman 1985). A more restrictive assumption is that the individual labor supply function is linear and additively separable in non-labor income and the net-of-tax hourly wage:

$H_i = \gamma_i + w_i (1 - r_i)\,\delta_i + N_i\,\eta_i$.
Now consider a simple experiment that assigns some individuals to a control group where $r_i$ and $N_i$ are not manipulated, and others to a treatment group that receives an additional baseline transfer $D$ and faces an increment to the tax rate $t$. Then, adopting the earlier potential outcomes framework, each individual has two potential outcomes:

$H_{i0} = \gamma_i + w_i (1 - r_i)\,\delta_i + N_i\,\eta_i$ and

$H_{i1} = \gamma_i + w_i (1 - r_i - t)\,\delta_i + (N_i + D)\,\eta_i$.
With random assignment, the difference in mean labor supply between treatment and control groups equals

$E[H_i \mid D_i = 1] - E[H_i \mid D_i = 0] = -t\,E[w_i \delta_i] + D\,E[\eta_i]$.

The first term here represents substitution effects, while the second represents income effects. But the simple experiment identifies only the combination of them.
Fortunately, the NIT studies involved multiple treatment arms, with various combinations of transfers and tax rates. Consider a simple extension of the above structure, with two treatment groups 1 and 2 and associated parameters $\{D_1, t_1\}$ and $\{D_2, t_2\}$. Now each individual has three potential outcomes associated with assignment to the control group and each of the treatment groups, $H_0$, $H_1$, and $H_2$. Two distinct treatment-control contrasts can be computed:

$E[H_i \mid D_i = 1] - E[H_i \mid D_i = 0] = -t_1\,E[w_i \delta_i] + D_1\,E[\eta_i]$ and

$E[H_i \mid D_i = 2] - E[H_i \mid D_i = 0] = -t_2\,E[w_i \delta_i] + D_2\,E[\eta_i]$.

This is a system of two linear equations in two unknowns. So long as the system has full rank – here, as long as $D_1/D_2 \neq t_1/t_2$ – it can be solved for the mean income elasticity of labor supply, $E[\eta_i]$, and for $E[w_i \delta_i]$. The latter can be divided by the mean wage rate, $E[w_i]$, to obtain a wage-rate-weighted mean substitution elasticity. (With a large enough sample, the mean substitution elasticity, $E[\delta_i]$, could be identified by stratifying the treatment-control comparison by the wage rate.)
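
The algebra of the two-arm case can be sketched as follows. The function name, the arm parameters, and the contrast values are hypothetical and chosen only so that the solution is easy to verify; the sketch solves the two contrast equations by Cramer's rule and ignores sampling error.

```python
def solve_two_arm_nit(contrast1, contrast2, D1, t1, D2, t2):
    """Solve   contrast_k = -t_k * S + D_k * I,   k = 1, 2,
    for S = E[w_i * delta_i] (substitution term) and I = E[eta_i] (income term)."""
    det = -t1 * D2 + t2 * D1  # determinant of [[-t1, D1], [-t2, D2]]
    if det == 0:
        raise ValueError("arms are collinear (D1/D2 == t1/t2): effects not separately identified")
    S = (contrast1 * D2 - contrast2 * D1) / det
    I = (-t1 * contrast2 + t2 * contrast1) / det
    return S, I


# Hypothetical arms: transfers of 1000 and 2000, tax-rate increments of 0.5 and 0.3.
D1, t1 = 1000.0, 0.5
D2, t2 = 2000.0, 0.3

# Hypothetical treatment-control contrasts in hours, generated from S = 20, I = -0.1.
contrast1 = -t1 * 20.0 + D1 * (-0.1)  # -110.0
contrast2 = -t2 * 20.0 + D2 * (-0.1)  # -206.0

S, I = solve_two_arm_nit(contrast1, contrast2, D1, t1, D2, t2)
print(S, I)  # approximately 20.0 and -0.1
```

If the two arms scaled the transfer and the tax rate in the same proportion, the determinant would be zero and the function would raise an error, which mirrors the full-rank condition stated above.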
A number of studies used the NIT experiment data to estimate the parameters of the labor supply function in basically this way, accounting for additional complications that we neglect here (e.g., participation decisions, non-linear tax schedules, etc.) and often using more complex labor supply functions. See, e.g., Moffitt (1979). But this was by no means universal: In the late 1970s, the experimental paradigm was not as well developed, and many of the studies that used the experimental data did not rely solely on the randomly assigned components of non-labor income and tax rates for identification (e.g., Keeley et al., 1978).
In the above simple model the mean income and labor supply elasticities are just identified with two treatment arms. With more than two arms – the Seattle/Denver experiment alone had 11 – the model is over-identified. This opens the possibility of performing over-identification tests of the restrictions imposed when specifying the labor supply function. Ashenfelter and Plant (1990) estimate separate treatment effects of each treatment arm, but we are not aware of studies that investigate formally whether the pattern of effects is consistent with a posited labor supply function.
Even absent multiple treatment arms, sometimes statistical or theoretical models and assumptions can enable researchers to learn about mechanisms that generate a program effect. For example, Card and Hyslop (2005) [henceforth CH] analyze the data from the Canadian Self Sufficiency Program (SSP) RCT. SSP, a welfare-to-work program, combined a strong, temporary work incentive for participating workers with a fixed initial time period during which welfare recipients had to establish eligibility in the program by working full time. As a result of this two-tiered structure, the simple experiment analysis does not distinguish the effects of the various components of the program. This makes it difficult to compare the effects of SSP with other welfare-to-work programs, to assess how SSP worked, and to draw lessons for similar programs. CH use a parametric statistical model to separately identify the effect of the different incentives inherent in the SSP program. In contrast to static evaluations of welfare-to-work programs, CH focus on the dynamic labor supply incentives inherent in the program.
One cannot directly analyze the effect of the subsidy (which in the following we will refer to as the SSP program) for those who became eligible, because of selection in the eligibility decision. One can, however, model eligibility as a type of imperfect compliance, permitting the estimation of the LATE of SSP on total employment or on the fraction employed at any given point in time. When one turns to dynamic analyses, potential differential changes in the nature of selection in the treatment and control groups make it impossible to estimate the dynamic responses of hazard rates or wages just based on the RCT.44 In addition, as in other welfare evaluations, endogenous employment decisions make an analysis of wage outcomes problematic. Another issue is that in the short run the strong work incentive arising from the option value in the eligibility period is potentially confounded with the effect of the subsidy.

44 CH use a standard search theory to model the incentives of SSP, and capture the effect of eligibility and the SSP subsidy on labor supply incentives via their effects on the reservation wage. The search model clarifies that in the presence of heterogeneity, the pool of workers employed at any given point in time may be selected, whether or not there also is sample selection arising from employment decisions (e.g., Ham and Lalonde 1996).
To address these difficulties, CH proceed by developing a logistic model with random effects and heterogeneity to estimate a benchmark for welfare transitions in the absence of SSP (i.e., for the control group). This model is then combined with parametric specifications of the treatment effects over different ranges of the program spell, as implied by incentives inherent in SSP. This step includes modeling the participation decision and welfare transitions as functions of the SSP subsidy and current and lagged welfare status. A key assumption here is that the chosen controls for heterogeneity and the functional form restrictions are sufficient to control for the dynamic selection bias introduced by the eligibility window. CH experiment with different specifications of heterogeneity, and provide ample discussion of the goodness of fit of the model. As a result of this exercise, they are able to obtain separate effects of eligibility and SSP. This allows them to simulate the effects of different components of the program and counterfactual policy changes relating to the time path of the subsidy.
The approach and findings in CH suggest that one may not need a structural model to separately identify multiple treatment effects, to estimate the dynamic effects of a program, or to simulate the effect of alternative policies. However, an assumption on functional form is required, as well as harder-to-assess assumptions on the form of underlying heterogeneity.
To estimate mechanisms underlying the effect of experimental or policy variation, other papers have used insights from theory to aid identification without estimating a structural model. For example, Schmieder, von Wachter, and Bender (2016) use insights from the standard search model to estimate the effect of unemployment duration on wages. A recurring question in the analysis and evaluation of welfare and unemployment programs has been the effect of employment and unemployment on productivity and wages. If wages rise with employment duration, welfare-to-work programs can lead to sustained labor force participation. In contrast, if longer nonemployment duration reduces wages, and hence the disincentive to work, more generous benefits can lead to a welfare trap. Card and Hyslop (2005) find that increased employment in the course of the Canadian Self-Sufficiency Program did little to increase wages. In contrast, Grogger (2005) finds positive wage impacts of employment in the context of a randomized evaluation of Florida’s welfare-to-work program.
Few papers have directly analyzed the effect of unemployment duration on
wages.45 The question is difficult for at least two reasons. First, as in Card and
Hyslop (2005), even with exogenous variation in incentives at the group level, the
type of worker employed at any given point in the unemployment spell may differ
between the treatment and control groups.46 In other words, it is difficult to find a
valid instrument for the duration of unemployment. A second complication
arises because even if such variation were available, a change in wages might arise
either because of a change in wage offers or because of a change in reservation wages.
45 An exception is Addison and Blackburn (2000), who discuss some of the issues that arise. A larger number of papers has addressed the question of duration dependence in unemployment spells. See Kroft, Lange, and Notowidigdo (2013) and references therein.
46 This bias arises even in the absence of differences in participation.
To address these difficulties, Schmieder, von Wachter, and Bender (2016)
use the fact that the canonical search model has the strong prediction that
forward-looking individuals valuing future unemployment insurance benefits will respond to
a benefit extension by raising their reservation wage well before benefit exhaustion.
Unless reservation wages do not bind, this implies that extensions in UI durations
should lead to increases in observed reemployment wages throughout the spell. In
contrast to this prediction, Schmieder et al. (2016) find, in the context of
discontinuous increases in unemployment insurance durations in Germany, that
reemployment wages at different points of the unemployment spell are unaffected.
They deduce that reservation wages likely had little effect on observed wages and
hence that the effect of an increase in UI benefit durations on wages arose from an
effect of the rise in nonemployment durations on offered wages. In this case, an
exogenous increase in UI benefit durations can be used as an instrument to estimate
the effect of nonemployment duration on wages.47
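The instrumental-variables logic of this argument can be illustrated with a minimal simulation. The sketch below assumes a binary benefit-extension indicator, a linear wage equation, and an exclusion restriction (the extension affects wages only through nonemployment duration, in line with the conclusion that reservation wages had little effect on observed wages). The data-generating process, parameter values, and function names are hypothetical and are not taken from Schmieder, von Wachter, and Bender (2016), who rely on discontinuities in benefit durations rather than on a randomized binary instrument.

```python
import numpy as np


def simulate_spells(n=200_000, beta=-0.05, seed=0):
    """Simulate hypothetical unemployment spells.

    beta is the assumed causal effect of one extra month of
    nonemployment on the log reemployment wage.
    """
    rng = np.random.default_rng(seed)
    extended = rng.integers(0, 2, size=n).astype(bool)  # instrument Z
    quality = rng.normal(size=n)                        # unobserved by the analyst
    # Higher-quality workers find jobs faster; the extension lengthens spells.
    duration = 6.0 + 2.0 * extended - 1.5 * quality + rng.normal(size=n)
    # Exclusion restriction: `extended` enters wages only through duration.
    log_wage = 2.5 + beta * duration + 0.3 * quality + rng.normal(scale=0.2, size=n)
    return extended, duration, log_wage


def naive_slope(duration, log_wage):
    """OLS slope of log wage on duration, ignoring selection on quality."""
    return np.cov(duration, log_wage)[0, 1] / np.var(duration, ddof=1)


def wald_estimate(extended, duration, log_wage):
    """Ratio of the reduced-form wage effect to the first-stage duration effect."""
    reduced_form = log_wage[extended].mean() - log_wage[~extended].mean()
    first_stage = duration[extended].mean() - duration[~extended].mean()
    return reduced_form / first_stage


extended, duration, log_wage = simulate_spells()
print("naive OLS slope:", round(naive_slope(duration, log_wage), 3))
print("Wald/IV estimate:", round(wald_estimate(extended, duration, log_wage), 3))
```

Under these assumed parameters the naive slope is contaminated by unobserved worker quality, while the Wald ratio stays close to the assumed causal effect. If the exclusion restriction failed (for example, if reservation wages responded to the extension), the ratio would no longer isolate the effect of duration.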
47 The authors argue that their test excludes any effect of the worker's outside option on wages, and hence the findings are not specific to the particular model.
Another study incorporating theoretical insights from search theory into an
empirical study of unemployment insurance is that of DellaVigna, Lindner, Reizer,
and Schmieder (2016), who analyze a change in the time path of UI benefits in
Hungary that kept benefits in the final tier unchanged. They use this variation to
structurally estimate key parameters of a model with reference dependence, and
find the model does quite well compared to an alternative model that explains the
pattern based on (unspecified) heterogeneity. The incorporation of non-standard
behavioral assumptions into the evaluation of labor market programs is still in its
infancy, but is an important avenue for future research.48
A topic closely related to the question of mechanisms is the extrapolation of
experimental evidence to consider the impacts of new policies not included in the
original evaluation. The value of such extrapolations has long been one of the
primary arguments in favor of structural modeling (and against reliance on purely
experimental evidence), but some scholars have found ways to synthesize the
approaches. The main challenge here is to bridge between the relatively few
parameters that are cleanly identified by an experiment and the larger set of
parameters that are needed to characterize most structural models.
48 For some exceptions, see Lemieux and MacLeod (2000), DellaVigna and Paserman (2005), Oreopoulos (2007); more recently, Chan (2014) examines the role of time-inconsistency in the context of the randomized evaluation of the Florida Transition Program. Babcock, Congdon, Katz, and Mullainathan (2012) give an overview of the potential importance of behavioral assumptions for the evaluation of public programs.
One way to do this is to start with a characterization of structural behavior
that is simple enough to be captured within the experimental evidence. For example,
if one assumes that the labor supply function is characterized by constant income
and (compensated) substitution elasticities, then the estimates of these parameters
that are identified by the NIT experiments are sufficient to identify the effects of
alternative NIT parameters that were not included in the experimental treatments.
A drawback of such an approach is that the range of policies that can be examined is
limited. The approach can be extended, of course, to estimate a more complex
structural model that relies on additional statistical and theoretical
assumptions, additional non-experimental moments, or both. In any event, this sort
of exercise is on more solid ground when trying to interpolate to values within the
range of tax parameters included in the experiment than when these parameters
need to be extrapolated outside of that range.
A more recent, closely related approach is known as the “sufficient statistics”
approach (Chetty 2009). Here, the goal is to characterize optimal policy. Starting
with a fully characterized (but usually not overly complex) structural model, it is
often possible to derive expressions for social welfare, or for the optimal policy, that
depend only on a small number of reduced-form parameters. For example, the
Baily-Chetty (Baily 1978, Chetty 2006) formula for optimal unemployment insurance
benefits expresses the optimal benefit level in terms of the elasticity of
unemployment duration with respect to UI benefits, and the income and
substitution effects on the exit hazard from unemployment. If one had experimental
evidence regarding these effects, one could use the formula to derive the optimal
policy (e.g., Chetty 2008, Card, Chetty, and Weber 2007).
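For reference, the simplest textbook version of this condition can be sketched as follows. The notation is introduced here for illustration and is not the chapter's: b is the benefit level, D is expected unemployment duration, c_e and c_u are consumption when employed and unemployed, u is the utility function, and gamma is the coefficient of relative risk aversion. The second line is an approximation that assumes third- and higher-order terms of utility can be ignored.

```latex
\begin{align}
\frac{u'(c_u) - u'(c_e)}{u'(c_e)} &= \varepsilon_{D,b},
\qquad \varepsilon_{D,b} \equiv \frac{b}{D}\,\frac{\partial D}{\partial b}, \\
\gamma \,\frac{c_e - c_u}{c_e} &\approx \varepsilon_{D,b}.
\end{align}
```

The left-hand side is the marginal value of insurance (the gap in marginal utilities between the two states), and the right-hand side is the behavioral cost of raising benefits. Read this way, experimental evidence on the duration elasticity pins down one side of the condition, while Chetty (2008) relates the other side to liquidity and moral hazard effects on the exit hazard.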
Of course, any sufficient statistics approach is dependent upon the validity of
the underlying structural model; there is no assurance that the true structural
model generates the same sufficient statistics as does the one posited by the
researcher. In some cases, the same sufficient statistics hold across a relevant class
of models and hence provide a degree of robustness. For instance, Chetty (2009) gives
the example of heterogeneity in treatment effects, where the optimal policy depends only
on the mean effect. Yet, it can be hard to know which assumptions in the structural model
matter, and generally the assumptions needed to derive the sufficient statistics are
fairly strong. At a practical level, conclusions about optimal policies may involve
extrapolating very far from the range of policy variation included in the experiment,
which means relying strongly on the validity of the theoretical model. In this
context, a potential drawback of sufficient statistics is that, in contrast to explicitly
structural work, the empirical fit of the model against the data cannot be assessed.
An alternative approach to obtain a framework for policy extrapolation
based on experimental variation is to estimate, or calibrate, a full structural model,
using experimental evidence to aid in identifying (some of) the necessary
parameters. One approach is to fix individual parameters at the values indicated by
experiments, then calibrate or structurally estimate the remainder. This approach is
pursued, for example, by Davidson and Woodbury (1997), who use the Illinois
reemployment bonus experiment to estimate the parameters of a search cost
function, then combine this function with calibrated values, derived from
non-experimental data, for other parameters of their model of optimal UI benefits.
Another approach is to use experimental data to fit a full structural model, but keep
the model sufficiently simple such that the main parameters of the model are
identified by the available variation, as for example in DellaVigna, Lindner, Reizer,
and Schmieder (2016). An alternative is to estimate the structural model solely with
non-experimental data, then use experimental
evidence to validate predictions that the model makes for particular reduced-form
comparisons (e.g., Todd and Wolpin 2006).49
49 Another approach to extrapolation that can be viewed as a hybrid between structural and reduced form approaches is to use experimental variation in the incentive to take up a program to effectively estimate a structural model of the compliance rate (e.g., Heckman and Vytlacil 2005). As described in Section 5.d, under certain circumstances this allows one to obtain the full distribution of marginal treatment effects and hence to extrapolate.
ii. Addressing the issue ex ante through the design of the experiment
In some cases, the experimental design can be structured to help uncover the
mechanisms underlying the treatment effect of the program. Economic theory may
be particularly useful here in connecting fundamental parameters and mechanisms
to the types of impacts that can be measured with experiments. One approach is to
design an experiment that targets a particular mechanism of interest, rather than
identifying the effect of a well-defined program that might be implemented. Kling,
Congdon, Ludwig, and Mullainathan (this volume) refer to this as a “mechanism
experiment,” distinguishing it from a program evaluation. Standard models in labor
economics or other fields may provide useful characterizations of the behavioral
mechanisms to be tested. For example, models of human capital investment have
implications for the factors determining take-up and success of training or schooling
programs that may be useful in structuring the experimental design.
A closely related approach is to introduce multiple treatment arms, with
program variation among them that can help uncover underlying parameters. The
NIT experiments discussed above present a straightforward example of a congenial
marriage of classic (static) labor supply theory and the experimental design. As
discussed above, as long as both income and substitution effects are linear in the
relevant tax measure, multiple treatments manipulating both the base transfer and
the marginal tax rate can be used to separately estimate the income and substitution
effects.50
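As an illustration of this identification argument, the sketch below assumes that the experimental impact on annual hours is linear in the guarantee (base transfer) and in the marginal tax rate. With at least two arms whose (guarantee, tax rate) pairs are not proportional, the two response coefficients can be recovered, and the fitted relationship can then be used to predict an untested arm. The arm values, impacts, and function names are hypothetical, and the two coefficients are reduced-form responses that map into income and substitution effects only under the additional assumptions of the static labor supply model.

```python
import numpy as np


def fit_linear_responses(arms, impacts):
    """Fit impact = a * guarantee + b * tax_rate across treatment arms.

    arms: sequence of (guarantee, tax_rate) pairs, one per treatment arm.
    impacts: treatment-control difference in mean hours for each arm.
    """
    X = np.asarray(arms, dtype=float)
    y = np.asarray(impacts, dtype=float)
    if np.linalg.matrix_rank(X) < 2:
        raise ValueError("arms must vary guarantee and tax rate independently")
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [response to guarantee, response to tax rate]


def predict_impact(coef, guarantee, tax_rate):
    """Predicted impact for a (guarantee, tax_rate) pair not in the experiment."""
    return float(coef[0] * guarantee + coef[1] * tax_rate)


# Hypothetical arms: guarantee in $1,000s per year, marginal tax rate as a share.
arms = [(3.0, 0.5), (3.0, 0.7), (5.0, 0.5)]
impacts = [-110.0, -130.0, -150.0]  # made-up hours impacts, not NIT estimates

coef = fit_linear_responses(arms, impacts)
print("guarantee response:", round(coef[0], 2))
print("tax-rate response:", round(coef[1], 2))
print("predicted impact at (4.0, 0.6):", round(predict_impact(coef, 4.0, 0.6), 1))
```

The rank check encodes the design requirement: if every arm scales the guarantee and the tax rate in the same proportion, the two responses cannot be separated no matter how large the sample is.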
50 Multiple treatments may not be necessary. For example, with appropriate data and assumptions, one could in principle experimentally vary compensated wage changes to identify the compensated substitution effect. This more closely resembles a mechanism experiment.
The evaluation of the SSP program discussed above is a good example of an
experiment that would have benefited from a second treatment arm. Such a
treatment might have randomly varied the incentive to become eligible for the
(randomly assigned) work subsidy in the main phase of the program. More
generally, decisions and programs involving inter-temporal tradeoffs may be an
area in which more complex experiments can be particularly insightful. For
example, typical UI systems involve expiring benefits, and JSA programs involve
sanctions; the timing of benefit exhaustion, reemployment bonuses, or sanctions has
been shown to have important empirical effects on reemployment rates (e.g., Meyer
1995, Black, Smith, Berger, and Noel 2003, Schmieder, von Wachter, and Bender
2012). Hence, experiments that try to get at the underlying behavioral
mechanisms may provide important insights into how these programs affect labor
supply choices. Knowledge of such mechanisms is also a crucial input in optimizing
the delivery of insurance or assistance in the labor market. For example, this could
involve a reemployment bonus that declines over time, or one that is available only
to those who survive to a specified point. By randomly varying the amount, slope, or
intervals, one may gain insights into the nature of inter-temporal decision making
relevant for these programs. Inter-temporal choice is also an area where theory is
likely to be helpful to provide identifying structure. For example, if the goal is
to learn about potential behavioral biases, a model of the effect of particular
biases can yield insightful predictions for job search behavior (e.g., DellaVigna,
Lindner, Reizer and Schmieder 2016).51
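A design of this kind can be sketched in a few lines. The example below assumes a hypothetical reemployment bonus defined by three parameters (initial amount, weekly decline, and the first week of eligibility) and assigns claimants at random, in equal or near-equal numbers, across the arms formed by crossing those parameters. All parameter values and function names are illustrative assumptions rather than features of any actual demonstration, and a real design would typically add a no-bonus control arm.

```python
import itertools
import random


def bonus_at_week(week, amount, weekly_decline, first_eligible_week):
    """Bonus paid to a claimant who is reemployed in `week` of the spell."""
    if week < first_eligible_week:
        return 0.0  # bonus only for those who survive to the eligibility week
    weeks_since_eligible = week - first_eligible_week
    return max(0.0, amount - weekly_decline * weeks_since_eligible)


def assign_arms(claimant_ids, arms, seed=0):
    """Shuffle claimants, then deal them out to arms in rotation."""
    rng = random.Random(seed)
    ids = list(claimant_ids)
    rng.shuffle(ids)
    return {cid: arms[i % len(arms)] for i, cid in enumerate(ids)}


# Hypothetical design: 2 amounts x 2 slopes x 2 eligibility weeks = 8 arms.
ARMS = list(itertools.product([500.0, 1000.0], [0.0, 50.0], [0, 8]))

assignment = assign_arms(range(800), ARMS)
print(len(ARMS), "arms;", "claimant 0 assigned to", assignment[0])
```

Crossing the parameters, rather than varying one at a time, is what allows the separate effects of the amount, the slope, and the eligibility interval to be compared within a single experiment.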
The usefulness of theory in informing experimental designs hinges, of course,
on the model being correct. To mitigate the reliance on particular assumptions (e.g.,
on functional forms), in principle one could use revealed preference arguments to
generate robust predictions from theory that are then used in the design of an
experiment. E.g., one could use results obtained by Pinto (2015) or Kline and Tartari
(2016) to devise multiple treatment arms to test the implied restrictions. However,
a model may not be necessary to enrich the experimental design to study underlying
channels. The SSP example shows that a basic understanding of the incentives and
the nature of the program can be sufficient to design an RCT that uncovers the
potentially complex mechanisms underlying the simple SSP evaluation.
51 As already mentioned in the discussion of heterogeneous treatment effects, another area where theory is likely to be useful is to understand the determination of compliance rates. As discussed above, the main idea is to experimentally manipulate the incentive to participate and use the variation to trace out the marginal treatment effect (MTE) curve. Theoretical considerations can tell us how to realistically vary the cost of compliance and hence be able to estimate the full range of treatment effects.
V. Conclusion
Because they allow researchers to control assignment into treatment,
randomized controlled trials are the gold standard for program evaluation. But
while random assignment solves the selection problem, there is a broad range of
additional relevant design issues that arise routinely in the analysis of central
economic questions that are not solved by random assignment on its own. In this
chapter, we have discussed six such design issues in depth, including (1) spillover
effects and interactions between individuals, leading to a failure of SUTVA; (2)
impacts on outcomes that are only observed conditional on individual choices and
hence are endogenous, such as wages, hours worked, or participation in a follow-up
survey; (3) heterogeneity in treatment effects between experimental sites and
observed population groups, or (4) imperfect compliance and heterogeneity in
unobserved characteristics, both of which can make it hard to interpret treatment
effects and extrapolate to other programs; (5) hidden treatment effects arising
because controls also receive versions of the treatment; and (6) the understanding
of the mechanisms behind the treatment effect, in particular in the presence of
multiple treatments.
We discuss these design issues and solutions in the context of social
experiments in the United States labor market, which have provided most of what
we know about the functioning of the main labor market programs. Of course, the
labor economics literature has been well aware of the limitations of experiments
in general and some of these design issues in particular. We have reviewed
approaches that can be used to address the design issues in the context of
randomized experiments. This includes approaches that can be applied once
randomization is completed, and ways to modify the experiments themselves to address
the concerns we identify.
While we discuss design issues in the context of experiments in the labor
market, these issues can arise in all areas that have seen active
experimentation, including field experiments discussed elsewhere in this volume. Hence the
solutions we identify can be applied to a broad range of questions and should be
useful for a wide range of researchers interested in harnessing the power of
randomized controlled trials.
We close with a brief discussion of recent trends in labor market social
experiments, several of which highlight the need to pay more attention to the
potential design issues in experimental evaluations that we discuss. One
overarching trend, cutting across several areas of research, is that academic
economists have become more involved with the implementation of experiments. In
labor economics, for example, this has meant a shift away from randomized
controlled trials implemented by large, specialized policy consulting firms (e.g.,
Mathematica, MDRC, or Abt Associates). For example, several experiments have
evaluated take-up of actual government programs within the context of services
provided by H&R Block (e.g., Bettinger, Long, Oreopoulos, and Sanbonmatsu 2012).
Another example is the increasing number of randomized trials evaluating the role
of economic incentives for teachers (e.g., Fryer, Levitt, List, and Sadoff 2012; Fryer
2013; Springer et al. 2010). Similarly, experiments taking place within private
businesses have also been quite successful (e.g., Bandiera, Barankay, and Rasul
2009).
The greater involvement of academic economists carries both upside
potential, if researchers implement state-of-the-art techniques to address additional
design issues, and challenges, as there is a broad range of issues that must be
considered and monitored when implementing an experimental evaluation of an
existing program or a new, complex treatment in a real-world setting. We hope the
discussion of the design issues in this chapter, as well as our summary of the
practical aspects of implementing social experiments, will provide a useful guide for
those interested in implementing such social experiments.
A second, related trend has been a movement toward evaluating topics in
personnel economics (e.g., the response of teachers to incentive pay programs) as
distinct from government social programs. These evaluations are often conducted within
particular firms, and implicate a number of the design issues we discuss, most
notably issues of site effects and heterogeneity.
A third important trend has been the use of the actual online labor market
for what amount to field experiments in the taxonomy we set out at the outset (e.g.,
Pallais 2014). The Internet may well provide a useful resource for future social
experiments as well. A key advantage may be that researchers may be able to
better control the environment, perhaps allowing them to implement more complex
study designs that address some of the issues we pose.
References
Addison, J.T., Blackburn, M.L. 2000. The effects of unemployment insurance on postunemployment earnings. Labour Economics, 7(1), 21-53.
Ahn, H., Powell, J.L. 1993. Semiparametric estimation of censored selection models with a nonparametric selection mechanism. Journal of Econometrics, 58(1), 3-29.
Allcott, H. 2015. Site selection bias in program evaluation. Quarterly Journal of Economics, 130(3), 1117-1165.
Altonji, J.G., Blank, R.M. 1999. Race and gender in the labor market. Handbook of Labor Economics, 3(3), 3143-3259.
Anderson, M. 2008. Multiple inference and gender differences in the effects of early intervention: A reevaluation of the Abecedarian, Perry Preschool, and Early Training projects. Journal of the American Statistical Association, 103(484), 1481-1495.
Angrist, J.D., Hull, P., Pathak, P.A., Walters, C. 2015. Leveraging lotteries for school value-added: Testing and estimation (Working Paper 21748). National Bureau of Economic Research.
Angrist, J.D., Imbens, G.W. 1995. Two-stage least squares estimation of average causal effects in models with variable treatment intensity. Journal of the American Statistical Association, 90(430), 431-442.
Angrist, J.D., Imbens, G.W., Rubin, D.B. 1996. Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91(434), 444-455.
Angrist, J.D., Krueger, A.B. 1999. Empirical strategies in labor economics. Handbook of Labor Economics, 3, 1277-1366.
Ashenfelter, O., Ashmore, D., Deschênes, O. 2004. Do unemployment insurance recipients actively seek work? Evidence from randomized trials in four US states. Journal of Econometrics, 125(1-2), 53-75.
Ashenfelter, O., Plant, M.W. 1990. Nonparametric estimates of the labor-supply effects of negative income tax programs. Journal of Labor Economics, 8(1), S396-S415.
Athey, S., Imbens, G. 2016. The econometrics of randomized experiments. Handbook of Field Experiments (forthcoming).
Babcock, L., Congdon, W.J., Katz, L.F., Mullainathan, S. 2012. Notes on behavioral economics and labor market policy. IZA Journal of Labor Policy, 1(2), 1-14.
Baily, M.N. 1978. Some aspects of optimal unemployment insurance. Journal of Public Economics, 10(3), 379-402.
Baird, S., Bohren, A., McIntosh, C., Ozler, B. 2015. Designing experiments to measure spillover effects, second version (Working Paper 15-021). Penn Institute for Economic Research.
Bandiera, O., Barankay, I., Rasul, I. 2009. Social connections and incentives in the workplace: Evidence from personnel data. Econometrica, 77(4), 1047-1094.
Barnes, M.S., Benus, J., Cooper, J., Dugan, M.K., Kirsch, M.P., Johnson, T. 2014. U.S. Department of Labor Jobs Corps Process Study Final Report. U.S. Department of Labor. [Available at: http://wdr.doleta.gov/research/keyword.cfm?fuseaction=dsp_resultDetails&pub_id=2538&mp=y].
Barnow, B.S. 2000. Exploring the relationship between performance management and program impact: A case study of the Job Training Partnership Act. Journal of Policy Analysis and Management, 19(1), 118-141.
Becerra, R.M., Lew, V., Mitchell, M.N., Ono, H. 1998. Final report: California Work Pays Demonstration Project, report of the first forty-two months. School of Public Policy and Social Research, University of California-Los Angeles, Los Angeles.
Beecroft, E., Lee, W., Long, D., Holcomb, P.A., Thompson, T.S., Pindus, N., O’Brien, C., Bernstein, J. 2003. The Indiana welfare reform evaluation: Five-year impacts, implementation, costs and benefits. Abt Associates: Cambridge, MA.
Bell, S.H., Bloom, H.S., Cave, G., Doolittle, F., Lin, W., Orr, L.L. 1994. The National JTPA Study: Overview: Impacts, benefits, and costs of Title II-A. Abt Associates: Cambridge, MA.
Bell, S.H., Orr, L.L., Burstein, N.R. 1987. Evaluation of the AFDC Homemaker-Home Health Aide Demonstrations: Overview of evaluation results. Abt Associates: Cambridge, MA.
Benus, J., Yamagata, E.P., Wang, Y., Blass, E. 2008. Reemployment and Eligibility Assessment (REA) study: FY 2005 initiative: Final report. IMPAQ International, 1-173.
Bertrand, M., Mullainathan, S. 2004. Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination. American Economic Review, 94(4), 991-1013.
Bettinger, E., Long, B.T., Oreopoulos, P., Sanbonmatsu, L. 2012. The role of application assistance and information in college decisions: Results from the H&R Block FAFSA experiment. Quarterly Journal of Economics, 127(3), 1205-1242.
Bitler, M.P., Gelbach, J.B., Hoynes, H.W. 2006. What mean impacts miss: Distributional effects of welfare reform experiments. The American Economic Review, 96(4), 988-1012.
Black, D.A., Galdo, J., Smith, J.A. 2007. Evaluating the Worker Profiling and Reemployment Services System using a regression discontinuity approach. The American Economic Review, 97(2), 104-107.
Black, D.A., Smith, J.A., Berger, M.C., Noel, B.J. 2003. Is the threat of reemployment services more effective than the services themselves? Evidence from random assignment in the UI system. American Economic Review, 93(4), 1313-1327.
Bloom, H.S., Hill, C.J., Riccio, J.A. 2005. Modeling cross-site experimental differences to find out why program effectiveness varies. In Bloom, H.S., ed., Learning more from social experiments: Evolving analytic approaches. Russell Sage Foundation, 37-74.
Bloom, D., Kemple, J.J., Morris, P., Scrivener, S., Verma, N., Hendra, R. 2000. Final report on Florida’s initial time-limited welfare program. Manpower Demonstration Research Corporation: New York, December.
Bloom, H.S., Orr, L.L., Bell, S.H., Cave, G., Doolittle, F., Lin, W., Bos, J.M. 1997. The benefits and costs of JTPA Title II-A programs: Key findings from the National Job Training Partnership Act Study. Journal of Human Resources, 32(3), 549-576.
Bloom, D., Scrivener, S., Michalopoulos, C., Morris, P., Hendra, R., Adams-Ciardullo, D., Walter, J. 2002. Jobs First: Final report on Connecticut's welfare reform initiative. Manpower Demonstration Research Corporation.
Blundell, R., Bozio, A., Laroque, G. 2011. Labor supply and the extensive margin. The American Economic Review, 101(3), 482-486.
Blundell, R., Dias, M.C., Meghir, C., Reenen, J.V. 2004. Evaluating the employment impact of a mandatory job search program. Journal of the European Economic Association, 2(4), 569-606.
Brinch, C., Mogstad, M., Wiswall, M. Forthcoming. Beyond LATE with a discrete instrument. Journal of Political Economy.
Buchinsky, M. 1994. Changes in the US wage structure 1963-1987: Application of quantile regression. Econometrica: Journal of the Econometric Society, 62(2), 405-458.
Burghardt, J., Schochet, P.Z., McConnell, S., Johnson, T., Gritz, R.M., Glazerman, S., Homrighausen, J., Jackson, R. 2001. Does Job Corps work? Summary of the National Job Corps Study. Mathematica Policy Research: Princeton, NJ.
Card, D., Chetty, R., Weber, A. 2007. Cash-on-hand and competing models of intertemporal behavior: New evidence from the labor market. The Quarterly Journal of Economics, 122(4), 1511-1560.
Card, D., Hyslop, D.R. 2005. Estimating the effects of a time-limited earnings subsidy for welfare-leavers. Econometrica, 73(6), 1723-1770.
Card, D., Kluve, J., Weber, A. 2010. Active labor market programs: A meta-analysis. The Economic Journal, 120(548), F452-477.
Cave, G., Bos, H., Doolittle, F., Toussaint, C. 1993. JOBSTART. Final report on a program for school dropouts. Manpower Demonstration Research Corp: New York.
Cerqua, A., Pellegrini, G. 2014. Do subsidies to private capital boost firms' growth? A multiple regression discontinuity design approach. Journal of Public Economics, 109(C), 114-126.
Chan, M.K. 2014. Welfare dependence and self-control: An empirical analysis. Working paper, Economics Discipline Group, UTS Business School, University of Technology, Sydney.
Chetty, R. 2006. A general formula for the optimal level of social insurance. Journal of Public Economics, 90(10), 1879-1901.
Chetty, R. 2008. Moral hazard versus liquidity and optimal unemployment insurance. Journal of Political Economy, 116(2), 173-234.
Chetty, R. 2009. Is the taxable income elasticity sufficient to calculate deadweight loss? The implications of evasion and avoidance. American Economic Journal: Economic Policy, 1(2), 31-52.
Chetty, R., Friedman, J.N., Rockoff, J.E. 2014. Measuring the impacts of teachers I: Evaluating bias in teacher value-added estimates. American Economic Review, 104(9), 2593-2632.
Chodorow-Reich, G., Karabarbounis, L. 2016. The limited macroeconomic effects of unemployment benefit extensions (Working Paper 22163). National Bureau of Economic Research.
Coglianese, J.J. 2015. Do unemployment insurance extensions reduce employment? Mimeo, Harvard University.
Corson, W., Decker, P., Dunstan, S.M., Kerachsky, S. 1991. Pennsylvania reemployment bonus demonstration: Final report (Unemployment Insurance Occasional Paper 92-1). U.S. Department of Labor: Washington, DC.
Corson, W., Long, D., Nicholson, W. 1984. Evaluation of the Charleston Claimant Placement and Work Test Demonstration. Mathematica Policy Research.
Crépon, B., Duflo, E., Gurgand, M., Rathelot, R., Zamora, P. 2013. Do labor market policies have displacement effects? Evidence from a clustered randomized experiment. The Quarterly Journal of Economics, 128(2), 531-580.
Davidson, C., Woodbury, S.A. 1997. Optimal unemployment insurance. Journal of Public Economics, 64(3), 359-387.
Deaton, A. 2010. Instruments, randomization, and learning about development. Journal of Economic Literature, 48(2), 424-455.
Dehejia, R.H., Wahba, S. 2002. Propensity score-matching methods for nonexperimental causal studies. Review of Economics and Statistics, 84(1), 151-161.
DellaVigna, S., Lindner, A., Reizer, B., Schmieder, J.F. 2016. Reference-dependent job search: Evidence from Hungary (Working Paper 22257). National Bureau of Economic Research.
DellaVigna, S., Paserman, M.D. 2005. Job search and impatience. Journal of Labor Economics, 23(3), 527-588.
DiNardo, J., Fortin, N.M., Lemieux, T. 1996. Labor market institutions and the distribution of wages, 1973-1992: A semiparametric approach. Econometrica, 64(5), 1001-1044.
Dorsett, R., Hendra, R., Robins, P.K., Williams, S. 2013. Can post-employment services combined with financial incentives improve employment retention for welfare recipients? Evidence from the Texas Employment Retention and Advancement Evaluation. NIESR Discussion Paper No. 409.
Farber, H.S., Silverman, D., von Wachter, T. 2015. Factors determining callbacks to job applications by the unemployed: An audit study (Working Paper 21689). National Bureau of Economic Research.
Fein, D.J., Beecroft, E., Blomquist, J.D. 1994. Ohio Transitions to Independence Demonstration. Final impacts for JOBS and work choice. Abt Associates: Cambridge, MA.
Feller, A., Grindal, T., Miratrix, L.W., Page, L.C. 2014. Compared to what? Variation in the impacts of early childhood education by alternative care-type settings. Working paper.
Ferracci, M., Jolivet, G., van den Berg, G.J. 2010. Treatment evaluation in the case of interactions within markets (No. 4700). Working paper, Institute for the Study of Labor (IZA).
Fraker, T., Maynard, R. 1987. The adequacy of comparison group designs for evaluations of employment-related programs. Journal of Human Resources, 22(2), 194-227.
Freedman, S., Friedlander, D., Riccio, J. 1994. GAIN: Benefits, costs, and three-year impacts of a welfare-to-work program. Manpower Demonstration Research Corp.
Freedman, S., Knab, J.T., Gennetian, L.A., Navarro, D. 2000. The Los Angeles Jobs-First GAIN Evaluation: Final report on a work first program in a major urban center. Manpower Demonstration Research Corporation: New York.
Fryer, R. 2013. Teacher incentives and student achievement: Evidence from New York City public schools. Journal of Labor Economics, 31(2), 373-427.
Fryer, R., Levitt, S.D., List, J., Sadoff, S. 2012. Enhancing the efficacy of teacher incentives through loss aversion: A field experiment (Working Paper 18237). National Bureau of Economic Research.
Gautier, P.A., Muller, P., Rosholm, M., Svarer, M., van der Klaauw, B. 2012. Estimating equilibrium effects of job search assistance (No. 9066). CEPR Discussion Papers.
Gold, S.F. 1971. The failure of the Work Incentive (WIN) program. University of Pennsylvania Law Review, 119(3), 485-501.
Greenberg, D.H., Robins, P.K. 1986. The changing role of social experiments in policy analysis. Journal of Policy Analysis and Management, 5(2), 340-362.
Greenberg, D.H., Shroder, M. 2004. The digest of social experiments. The Urban Institute, 3rd edition.
Greenberg, D.H., Shroder, M., Onstott, M. 1999. The social experiment market. The Journal of Economic Perspectives, 13(3), 157-172.
Grogger, J. 2005. Welfare reform, returns to experience, and wages: Using reservation wages to account for sample selection bias. The Review of Economics and Statistics, 91(3), 490-502.
Gronau, R. 1973. The effect of children on the housewife's value of time. Journal of Political Economy, 81(2), S168-S199.
Grossman, J.B., Roberts, J. 1989. Welfare savings from employment and training programs for welfare recipients. The Review of Economics and Statistics, 71(3), 532-537.
Gueron, J. Forthcoming. The politics and practice of social experiments: Seeds of a revolution. Handbook of Field Experiments.
Hagedorn, M., Karahan, F., Manovskii, I., Mitman, K. 2015. Unemployment benefits and unemployment in the Great Recession: The role of macro effects. Federal Reserve Bank of New York Staff Report 646, revised February 2015.
Hagedorn, M., Manovskii, I., Mitman, K. 2015. The impact of unemployment benefit extensions on employment: The 2014 employment miracle? (Working Paper 20884). National Bureau of Economic Research.
Ham, J.C., LaLonde, R.J. 1996. The effect of sample selection and initial conditions in duration models: Evidence from experimental data on training. Econometrica: Journal of the Econometric Society, 64(1), 175-205.
Ham, J.C., Li, X., Reagan, P.B. 2011. Matching and semi-parametric IV estimation, a distance-based measure of migration, and the wages of young men. Journal of Econometrics, 161(2), 208-227.
Hamilton, G., Freedman, S., Gennetian, L., Michalopoulos, C., Walter, J. 2001. National evaluation of welfare-to-work strategies: How effective are different welfare-to-work approaches? Five-year adult and child impacts for eleven programs. US Department of Health and Human Services and US Department of Education: Washington, DC.
Hamilton, G., Scrivener, S. 2012. Increasing employment stability and earnings for low-wage workers: Lessons from the Employment Retention and Advancement (ERA) project. Office of Planning, Research and Evaluation Report 2012-19. Administration for Children and Families, U.S. Department of Health and Human Services.
Harrison, G.W., List, J.A. 2004. Field experiments. Journal of Economic Literature, 42(4), 1009-1055.
Hausman, J.A. 1985. The econometrics of nonlinear budget sets. Fisher-Schultz lecture for the Econometric Society, Dublin: 1982. Econometrica, 53(6), 1255-1282.
Hausman, J.A., Wise, D.A. 1979. Attrition bias in experimental and panel data: The Gary Income Maintenance Experiment. Econometrica, 47(2), 455-73.
Heckman, J.J. 1979. Sample selection bias as a specification error. Econometrica, 47(1), 153-61.
Heckman, J.J. 2010. Building bridges between structural and program evaluation approaches to evaluating policy. Journal of Economic Literature, 48(2), 356-98.
Heckman, J., Hohmann, N., Smith, J., Khoo, M. 2000. Substitution and dropout bias in social experiments: A study of an influential social experiment. The Quarterly Journal of Economics, 115(2), 651-694.
Heckman, J.J., Hotz, V.J. 1989. Choosing among alternative nonexperimental methods for estimating the impact of social programs: The case of manpower training. Journal of the American Statistical Association, 84(408), 862-874.
Heckman, J.J., LaLonde, R.J., Smith, J.A. 1999. The economics and econometrics of active labor market programs. Handbook of Labor Economics, 3, 1865-2097.
Heckman, J.J., Smith, J.A. 1995. Assessing the case for social experiments. The Journal of Economic Perspectives, 9(2), 85-110.
Heckman, J.J., Smith, J., Clements, N. 1997. Making the most out of programme evaluations and social experiments: Accounting for heterogeneity in programme impacts. Review of Economic Studies, 64(4), 487-535.
Heckman, J.J., Vytlacil, E. 2005. Structural equations, treatment effects, and econometric policy evaluation. Econometrica, 73(3), 669-738.
Herrem, J.W., Schmitt, L.C. 1983. Eligibility review pilot project handbook. Wisconsin Department of Industry, Labor, and Human Relations: Madison, WI.
Holland, P.W. 1986. Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945-960.
Horowitz, J.L., Manski, C.F. 2000. Nonparametric analysis of randomized experiments with missing covariate and outcome data. Journal of the American Statistical Association, 95(449), 77-84.
Hotz, J. 1992. Recent experience in designing evaluations of social programs: The case of the National JTPA study. In Garfinkel, I., Manski, C., eds., Evaluating welfare and training programs, Cambridge, MA: Harvard University Press: 76-114.
Hotz, J., Imbens, G., Klerman, J. 2006. Evaluating the differential effects of alternative welfare-to-work training components: A reanalysis of the California GAIN program. Journal of Labor Economics, 24(3), 521-566.
Jackson, K.C., Rockoff, J.E., Staiger, D.O. 2014. Teacher effects and teacher-related policies. Annual Review of Economics, 6(1), 801-825.
Jacobson, L.S. 2009. Strengthening one-stop career centers: Helping more unemployed workers find jobs and build skills. Hamilton Project Discussion Paper 2009-01, April: The Brookings Institution, Washington DC.
Jaggers, M. 1984. ERP pilot project final report. Wisconsin Department of Industry, Labor, and Human Relations: Madison, WI.
Johnson, T.R., Pfiester, J.M., West, R.W., Dickinson, K.P. 1984. Design and implementation of the claimant placement and work test demonstration. SRI International: Menlo Park, CA.
Johnson, W., Kitamura, Y., Neal, D. 2000. Evaluating a simple method for estimating black-white gaps in median wages. American Economic Review, 90(2), 339-343.
Johnston, A.C., Mas, A. 2015. Potential unemployment insurance duration and labor supply: The individual and market-level response to a benefit cut. Unpublished working paper. Princeton University.
Kane, T.J., Staiger, D.O. 2008. Estimating teacher impacts on student achievement: An experimental evaluation (Working Paper 14607). National Bureau of Economic Research.
Keane, M.P. 2010. Structural vs. atheoretic approaches to econometrics. Journal of Econometrics, 156(1), 3-20.
Keeley, M.C., Robins, P.K., Spiegelman, R.G., West, R.W. 1978. The estimation of labor supply models using experimental data. The American Economic Review, 68(5), 873-887.
Kehrer, K.C., Moffitt, R.A., eds. 1976. The Gary income maintenance experiment: Initial findings report. Indiana University: Gary, Ind.
Kemple, J.J., Friedlander, D., Fellerath, V. 1995. Florida's Project Independence. Benefits, costs, and two-year impacts of Florida's JOBS program. Manpower Demonstration Research Corporation: New York.
Kershaw, D., Fair, J. 1976. The New Jersey income maintenance experiment. Volume 1: Operations, Surveys and Administration. Academic Press: New York.
Klepinger, D.H., Johnson, T.R., Joesch, J.M., Benus, J.M. 1997. Evaluation of the Maryland unemployment insurance work search demonstration (Unemployment Insurance Occasional Paper 98-2). U.S. Department of Labor, Employment and Training Administration, Unemployment Insurance Service: Washington DC.
Klepinger, D.H., Johnson, T.R., Joesch, J.M. 2002. Effects of unemployment insurance work-search requirements: The Maryland experiment. Industrial & Labor Relations Review, 56(1), 3-22.
Klerman, J.A., Minzner, A., Harkness, J., Mills, S., Cook, R., Savidge-Wilkins, G. 2013. Design report: Impact evaluation of Reemployment and Eligibility Assessment program. Abt Associates: May 7.
Kline, P., Tartari, M. 2016. Bounding the labor supply responses to a randomized welfare experiment: A revealed preference approach. American Economic Review, 106(4), 972-1014.
Kline, P., Walters, C. 2014. Evaluating public programs with close substitutes: The case of Head Start. UC Berkeley Institute for Research on Labor and Employment Working Paper #123-14.
Kling, J.R., Liebman, J.B., Katz, L.F. 2007. Experimental analysis of neighborhood effects. Econometrica, 75(1), 83-119.
Kling, J.R., Ludwig, J., Congdon, B., Mullainathan, S. Forthcoming. Social policy: Mechanism experiments and policy evaluations. Handbook of Field Experiments.
Knox, V.W., Miller, C., Gennetian, L.A. 2000. Reforming welfare and rewarding work: A summary of the final report on the Minnesota Family Investment Program (Vol. 8). Manpower Demonstration Research Corporation: New York.
Kornfeld, R., Bloom, H.S. 1999. Measuring program impacts on earnings and employment: Do unemployment insurance wage reports from employers agree with surveys of individuals? Journal of Labor Economics, 17(1), 168-197.
Kroft, K., Lange, F., Notowidigdo, M.J. 2013. Duration dependence and labor market conditions: Evidence from a field experiment. The Quarterly Journal of Economics, 128(3), 1123-1167.
Krueger, A.B., Mueller, A.I. 2016. A contribution to the empirics of reservation wages. American Economic Journal: Economic Policy, 8(1), 142-179.
LaLonde, R.J. 1986. Evaluating the econometric evaluations of training programs with experimental data. The American Economic Review, 76(4), 604-620.
Landais, C., Michaillat, P., Saez, E. 2015. A macroeconomic theory of optimal unemployment insurance (Working Paper 16526). National Bureau of Economic Research.
Lee, D.S. 2009. Training, wages, and sample selection: estimating sharp bounds on treatment effects. The Review of Economic Studies, 76(3), 1071-1102.
Lemieux, T., MacLeod, W.B. 2000. Supply side hysteresis: The case of the Canadian unemployment insurance system. Journal of Public Economics, 78(1), 139-170.
List, J.A., Rasul, I. 2011. Field experiments in labor economics. Handbook of Labor Economics, 4(4), 103-228.
Maguire, S., Freely, J., Clymer, C., Conway, M., Schwartz, D. 2010. Tuning in to local labor markets: Findings from the sectoral employment impact study. Public/Private Ventures: New York.
Manpower Demonstration Research Corporation Board of Directors. 1980. Summary and findings of the national supported work demonstration. Ballinger Publishing Company: Cambridge, MA.
Meyer, B.D. 1995. Lessons from the US unemployment insurance experiments. Journal of Economic Literature, 33(1), 91-131.
Mihaly, K., McCaffrey, D.F., Staiger, D.O., Lockwood, J.R. 2013. A composite estimator of effective teaching. MET Project. [Available at: http://www.metproject.org/downloads/MET_Composite_Estimator_of_Effective_Teaching_Research_Paper.pdf].
Miller, C., Van Dok, M., Tessler, B.L., Pennington, A. 2012. Strategies to help low-wage workers advance: Implementation and final impacts of the Work Advancement and Support Center (WASC) demonstration. Manpower Demonstration Research Corporation: New York.
Minnesota Department of Jobs and Training. 1990. Re-employ Minnesota. In Johnson, E.R., ed., Reemployment services to unemployed workers having difficulty becoming reemployed (Unemployment Insurance Occasional Paper 90-2). U.S. Department of Labor, Employment and Training Administration, Unemployment Insurance Service: Washington, DC.
Moffitt, R.A. 1979. The labor supply response in the Gary experiment. Journal of Human Resources, 14(4), 477-487.
Newey, W., Powell, J.L., Walker, J.R. 1990. Semiparametric estimation of selection models: Some empirical results. American Economic Review, 80(2), 324-328.
O'Leary, C.J. 2006. State UI job search rules and reemployment services. Monthly Labor Review, 129(6), 27-37. http://research.upjohn.org/jrnlarticles/3.
Oreopoulos, P. 2007. Do dropouts drop out too soon? Wealth, health and happiness from compulsory schooling. Journal of Public Economics, 91, 2213-2229.
Pallais, A. 2014. Inefficient hiring in entry-level labor markets. American Economic Review, 104(11), 3565-3599.
Palmer, J.L., Pechman, J.A. 1978. Welfare in rural areas: the North Carolina-Iowa income maintenance experiment. Brookings Institution: Washington, DC.
Perez-Johnson, I., Moore, Q., Santillano, R. 2011. Improving the effectiveness of individual training accounts: Long-term findings from an experimental evaluation of three service delivery models. Final Report. Mathematica, Inc.
Pinto, R. 2015. Selection bias in a controlled experiment: The case of Moving to Opportunity. Mimeo., University of Chicago.
Poe-Yamagata, E., Benus, J., Bill, N., Carrington, H., Michaelides, M., Shen, T. 2011. Impact of the Reemployment and Eligibility Assessment (REA) initiative. Impaq International.
Powell, J.L. 1984. Least absolute deviations estimation for the censored regression model. Journal of Econometrics, 25(3), 303-325.
Robins, P.K. 1985. A comparison of the labor supply findings from the four negative income tax experiments. Journal of Human Resources, 20(4), 567-582.
Rothstein, J. 2010. Teacher quality in educational production: Tracking, decay, and student achievement. Quarterly Journal of Economics, 125(1), 175-214.
Rothstein, J. 2016. Revisiting the impacts of teachers. Unpublished working paper. http://eml.berkeley.edu/~jrothst/workingpapers/rothstein_cfr.pdf.
Schmieder, J.F., von Wachter, T., Bender, S. 2012. The effects of extended unemployment insurance over the business cycle: Evidence from regression discontinuity estimates over 20 years. Quarterly Journal of Economics, 127(2), 701-752.
Schmieder, J.F., von Wachter, T., Bender, S. 2016. The effect of unemployment benefits and nonemployment durations on wages. American Economic Review, 106(3), 739-777.
Schochet, P.Z., Burghardt, J.A. 2008. Do Job Corps performance measures track program impacts? Journal of Policy Analysis and Management, 27(3), 556-576.
Schochet, P., Burghardt, J., McConnell, S. 2008. Does Job Corps work? Impact findings from the National Job Corps Study. Mathematica Policy Research.
Smith, J.A., Todd, P.E. 2005. Does matching overcome LaLonde's critique of nonexperimental estimators? Journal of Econometrics, 125(1), 305-353.
Spiegelman, R.G., O'Leary, C.J., Kline, K.J. 1992. The Washington Reemployment Bonus experiment: Final report (Unemployment Insurance Occasional Paper 92-6). U.S. Department of Labor: Washington, DC.
Springer, M.G., Ballou, D., Hamilton, L.S., Le, V.-N., Lockwood, J.R., McCaffrey, D.F., Pepper, M., Stecher, B.M. 2010. Teacher pay for performance: Experimental evidence from the Project on Incentives in Teaching. Conference paper, National Center on Performance Incentives.
SRI International. 1983. Final report of the Seattle-Denver income experiment, Volume I: Design and results. U.S. Department of Health and Human Services: Washington, DC.
Steinman, J.P. 1978. The Nevada claimant placement program. Employment Security Research, Nevada Employment Security Department.
Todd, P.E., Wolpin, K.I. 2006. Assessing the impact of a school subsidy program in Mexico: Using a social experiment to validate a dynamic behavioral model of child schooling and fertility. American Economic Review, 96(5), 1384-1417.
US Department of Health, Education, and Welfare. 1976. Summary report: Rural income maintenance experiment. Government Printing Office: Washington, DC.
Vytlacil, E. 2002. Independence, monotonicity, and latent index models: An equivalence result. Econometrica, 70(1), 331-341.
Walters, C. 2014. Inputs in the production of early childhood human capital: Evidence from Head Start (Working Paper 20639). National Bureau of Economic Research.
Watts, H.W., Rees, A. 1977a. The New Jersey Income Maintenance Experiment, Vol. II: Labor supply responses. Academic Press: New York.
Watts, H.W., Rees, A. 1977b. The New Jersey Income Maintenance Experiment, Vol. III: Expenditures, health, and social behavior, and the quality of the evidence. Academic Press: New York.
Woodbury, S.A., Spiegelman, R.G. 1987. Bonuses to workers and employers to reduce unemployment: Randomized trials in Illinois. The American Economic Review, 77(4), 513-530.

Table 1: Details on Selected Randomized Controlled Trials of Welfare Programs and Other Labor Supply Incentives for Low-Income Workers in the United States

(1) New Jersey Income Maintenance Experiment
Target population: Total family income not exceeding 150 percent of the poverty level
Interventions (primary listed first): Negative income tax
Start date: 1968
Cost (nominal $): $7,800,000
Sample size: 725 treatment; 632 control; 1,357 total
Treatment: Eight combinations of income guarantees and tax rates on other income.
Funding source: OEO
Outcomes of interest: (1) Reduction in work effort and (2) Lifestyle changes

(2) Rural Income Maintenance Experiment
Target population: Rural, low-income families
Interventions (primary listed first): Negative income tax
Start date: 1970
Cost (nominal $): $6,100,000
Sample size: 269 treatment; 318 control; 587 total
Treatment: Five negative income tax plans.
Funding source: The Ford Fdn.; OEO
Outcomes of interest: (1) Work behavior; (2) Health, school, and other effects on poor children; and (3) Savings and consumption behavior

(3) Seattle-Denver Income Maintenance Experiment
Target population: Families earning less than $11,000 in 1971 dollars
Interventions (primary listed first): Negative income tax; Vocational training
Start date: 1970
Cost (nominal $): $77,500,000
Sample size: 1,801 treatment 1; 946 treatment 2; 1,012 treatment 3; 1,041 control
Treatment: Two types of treatment: a negative income tax plan and a subsidy to vocational training.
Funding source: HEW; HHS
Outcomes of interest: (1) Effects on labor supply; (2) Marital stability; and (3) Other lifestyle changes

(4) Gary Income Maintenance Experiment
Target population: Black families with at least one child under the age of 18
Interventions (primary listed first): Negative income tax
Start date: 1971
Cost (nominal $): $20,300,000
Sample size: 1,028 treatment; 771 control; 1,799 total
Treatment: Four combinations of guarantee and tax.
Funding source: HEW
Outcomes of interest: (1) Employment; (2) Schooling; (3) Infant mortality and morbidity; (4) Educational achievement; and (5) Housing consumption

(5) California Work Pays Demonstration Program (CWPDP)
Target population: One- and two-parent families receiving AFDC
Interventions (primary listed first): Earned income disregard
Start date: 1993
Cost (nominal $): $4,500,000
Sample size: 6,278 treatment 1; 3,471 treatment 2; 3,276 control 1; 1,695 control 2; 14,720 total
Treatment: The treatment involved changing two provisions of the AFDC program. The "$30 and one-third" provision applied to all AFDC families and allowed welfare recipients to keep the first $30 and one-third of the remaining wages before welfare grant determinations were made. However, it expired after the recipient had been in the program for four months, and thereafter the grant was reduced dollar for dollar with earnings. Under the 100-hour rule, which applied only to two-parent families, the total work hours per month for the primary wage earner could not exceed 100 hours without loss of eligibility. Experimentals received a waiver of the time limit on the $30 and one-third income disregard, and a waiver of the 100-hour rule. However, the cash grants of experimentals were reduced by 8.5 percent. Controls were subject to the general AFDC rules, with expiring disregards, ineligibility after 100 hours, and higher benefits.
Funding source: CA Dept of Social Services
Outcomes of interest: (1) Employment; (2) Earnings; and (3) Welfare receipt

(6) Florida Family Transition Program (FTP)
Target population: Families on AFDC
Interventions (primary listed first): Earned income disregard; Individual job search assistance; Case management
Start date: 1994
Cost (nominal $): $11,200,000
Sample size: 1,400 treatment; 1,400 control; 2,800 total
Treatment: Limited welfare benefits unless "job-ready", enhanced earnings disregard, and intensive case management.
Funding source: FL Dept of Children and Families; US Department of Health and Human Services
Outcomes of interest: (1) Earnings; (2) Welfare benefit receipts; and (3) Outcomes for children

(7) Minnesota Family Investment Program (MFIP)
Target population: AFDC recipient and recent applicant families
Interventions (primary listed first): Reemployment bonus; Earned income disregard; Job search incentive; Child care services
Start date: 1994
Cost (nominal $): $5,090,300
Sample size: 5,275 treatment 1; 1,933 treatment 2; 5,634 treatment 3; 1,797 control; 14,639 total
Treatment: MFIP provided a 20 percent grant increase when recipients became employed, increased the level of income that would be disregarded in grant calculation, and paid the child care subsidy directly to the caregiver. Two-parent families were not subject to work history requirements or to the 100-hour rule. Both single-parent and two-parent families assigned to MFIP were subject to mandatory participation in employment services. Rules and procedures were simplified by combining Food Stamps, AFDC, and Minnesota's Family General Assistance (FGA) to form a single cash benefit program. Subjects assigned to the MFIP incentives-only group received the same benefits as MFIP, but were not required to participate in training services. Two other groups.
Funding source: MN Dept of Human Services; Ford Fdn.; HHS; US Department of Agriculture; Charles Stewart Mott Fdn.; Annie E Casey Fdn.; McKnight Fdn.; Northwest Area Fdn.
Outcomes of interest: (1) Employment; (2) Earnings; (3) Welfare receipt; (4) Total family income; and (5) Other measures of child and family well-being

(8) Connecticut Jobs First
Target population: AFDC recipients
Interventions (primary listed first): Earned income disregard; Time limit; Job search incentives; Vocational training
Start date: 1996
Cost (nominal $): $5,400,000
Sample size: 2,138 treatment; 1,821 control; 3,959 total
Treatment: Earnings disregarded below the federal poverty level and required to participate in Job Search Skills Training.
Funding source: CT Dept of Social Services
Outcomes of interest: (1) Employment; (2) Earnings; (3) Benefit receipt; and (4) Other measures of child well-being

(9) Illinois Unemployment Insurance Incentive Experiment
Target population: UI claimants
Interventions (primary listed first): Reemployment bonus
Start date: 1984
Cost (nominal $): $800,000
Sample size: 4,186 treatment (claimants); 3,963 treatment (employers); 3,963 control; 12,112 total
Treatment: The unemployed were offered a $500 bonus if they found a job within 11 weeks and held it for 4 months.
Funding source: IL Dept of Employment Security; WE Upjohn Institute for Employment Research
Outcomes of interest: (1) Reductions in unemployment spells and (2) Net program savings

(10) Pennsylvania Reemployment Bonus Demonstration
Target population: UI claimants
Interventions (primary listed first): Reemployment bonus; Job search workshop
Start date: 1988
Cost (nominal $): $990,000
Sample size: 14,086 treatment; 3,392 control; 17,478 total
Treatment: Five combinations of bonus amount and qualification period.
Funding source: DOL
Outcomes of interest: (1) UI receipt; (2) Employment; and (3) Earnings

(11) Washington State Reemployment Bonus Experiment
Target population: UI claimants
Interventions (primary listed first): Reemployment bonus
Start date: 1988
Cost (nominal $): $450,000
Sample size: 12,451 treatment; 3,083 control; 15,534 total
Treatment: Six variations of reemployment bonus amount and qualification periods.
Funding source: Alfred P Sloan Fdn.; US DOL, ETA
Outcomes of interest: (1) Weeks of insured unemployment and (2) UI receipt

Sources: (1) Kershaw and Fair, 1976; Watts and Rees, 1977a and 1977b; (2) US Department of Health, Education, and Welfare, 1976; Palmer and Pechman, 1978; (3) SRI International, 1983; (4) Kehrer, McDonald, and Moffitt, 1980; (5) Becerra, Lew, Mitchell, and Ono, 1998; (6) Bloom, Kemple, Morris, Scrivener, Verma, and Hendra, 2000; (7) Knox, Miller, and Gennetian, 2000; (8) Bloom, Scrivener, Michalopoulos, Morris, Hendra, Adams-Ciardullo, and Walter, 2002; (9) Woodbury and Spiegelman, 1987; (10) Corson, Decker, Dunstan, and Kerachsky, 1991; (11) Spiegelman, O'Leary, and Kline, 1992.
Abbreviations: DOL = US Department of Labor; ETA = Employment and Training Administration; Fdn. = Foundation; OEO = Office of Economic Opportunity; HEW = US Department of Health, Education, and Welfare; HHS = US Department of Health and Human Services.

Table 2: Details on Selected Randomized Controlled Trials of Programs Offering Job Training and Work Experience for Low-Income Individuals in the United States

(1) National Supported Work Demonstration (NSWD)
Target population: AFDC recipients, ex-offenders, substance abusers, and high school dropouts
Interventions (primary listed first): Work experience
Start date: 1975
Cost (nominal $): $82,400,000
Sample size: 3,214 treatment; 3,402 control; 6,616 total
Treatment: Employment in a structured work experience program involving peer group support, a graduated increase in work standards, and close sympathetic supervision, for 12 to 18 months.
Funding source: DOL, ETA; DOJ; Law Enforcement Assistance Administration; HHS; National Institute on Drug Abuse; HUD; US Department of Commerce; Ford Fdn.
Outcomes of interest: (1) Increases in post-treatment earnings; (2) Reductions in criminal activity; (3) Reductions in transfer payments; and (4) Reductions in drug abuse

(2) AFDC Homemaker-Home Health Aide Demonstrations
Target population: AFDC recipients
Interventions (primary listed first): Work experience
Start date: 1983
Cost (nominal $): $8,000,000
Sample size: 4,750 treatment; 4,750 control; 9,500 total
Treatment: Experimental AFDC subjects (trainees) received a four- to eight-week training course to become a homemaker-home health aide, followed by a year of subsidized employment. Control subjects did not receive this training, nor did they receive subsidized employment.
Funding source: Health Care Financing Administration
Outcomes of interest: (1) Employment; (2) Earnings; and (3) AFDC and food stamp payments and receipt

(3) National Job Training Partnership Act (JTPA) Study
Target population: Eligible Job Training Partnership Act Title II adults and out-of-school youth
Interventions (primary listed first): Vocational training; General education; Work experience; On-the-job training; Individual job search assistance
Start date: 1987
Cost (nominal $): $23,000,000
Sample size: 20,602
Treatment: Classroom training, on-the-job training, job search assistance, basic education, and work experience.
Funding source: DOL
Outcomes of interest: (1) Earnings; (2) Employment; (3) Welfare receipt; and (4) Attainment of educational credentials and occupational competencies

(4) Greater Avenues for Independence (GAIN)
Target population: AFDC recipients
Interventions (primary listed first): Vocational training; General education; Work experience; Individual job search assistance
Start date: 1988
Sample size: 24,528 treatment; 8,223 control; 32,751 total
Treatment: Basic education, job search activities, assessments, skills training, and work experience.
Funding source: California Department of Social Services (CDSS)
Outcomes of interest: (1) Participation in employment-related activities; (2) Earnings; (3) Welfare receipt; and (4) Employment

(5) JOBS
Target population: All recipients of ADC (Ohio's AFDC program)
Interventions (primary listed first): Work experience; General education; Individual job search assistance
Start date: 1989
Cost (nominal $): $3,000,000
Sample size: 24,120 treatment; 4,371 control
Treatment: Mandatory employment and training services, which included basic and post-secondary education, community work experience, and job search assistance.
Funding source: OH Dept of Human Services
Outcomes of interest: (1) Employment; (2) Earnings; and (3) Welfare receipt

(6) Sectoral Employment Impact Study
Target population: Low-income, disadvantaged workers and job seekers
Interventions (primary listed first): Vocational training; Individual job search assistance
Start date: 2003
Sample size: 1,286 total
Treatment: Industry-specific training programs that prepare unemployed and underskilled workers for skilled positions and connect them with employers seeking to fill such vacancies. Sectoral programs employ various approaches depending on the organization leading the effort and local employers' needs.
Funding source: Charles Stewart Mott Fdn.
Outcomes of interest: (1) Earnings; (2) Employment; and (3) Quality of jobs

(7) Work Advancement and Support Center (WASC) Demonstration
Target population: Low-wage workers
Interventions (primary listed first): Vocational training; On-the-job training; Case management
Start date: 2005
Sample size: 1,176 Dayton; 971 San Diego; 705 Bridgeport; 2,852 total
Treatment: The program offered participating workers intensive employment retention and advancement services, including career coaching and access to skills training. It also offered them easier access to work supports, in an effort to increase their incomes in the short run and help stabilize their employment. Finally, both services were offered in one location, in existing One-Stop Career Centers created by the Workforce Investment Act (WIA) of 1998, and by co-located teams of workforce and welfare staff.
Funding source: State of Ohio; County of San Diego Health and Human Services Agency; DOL ETA; U.S. Department of Agriculture, Food and Nutrition Service; HHS; Administration for Children and Families; Ford Fdn.; Rockefeller Fdn.; Annie E. Casey Fdn.; David and Lucile Packard Fdn.; The William and Flora Hewlett Fdn.; Joyce Fdn.; James Irvine Fdn.; Charles Stewart Mott Fdn.; Robert Wood Johnson Fdn.
Outcomes of interest: (1) Employment and (2) Earnings (along with many other outcome measures)

(8) JOBSTART
Target population: School dropouts aged 17-21 years
Interventions (primary listed first): General education; Vocational training; Individual job search assistance
Start date: 1985
Cost (nominal $): $6,200,000
Sample size: 1,163 treatment; 1,149 control; 2,312 total
Treatment: Education and vocational training, support services, and job placement assistance.
Funding source: DOL; Rockefeller Fdn.; Ford Fdn.; Charles Stewart Mott Fdn.; William and Flora Hewlett Fdn.; more foundations.
Outcomes of interest: (1) Educational attainment; (2) Employment; (3) Earnings; and (4) Welfare receipt

(9) National Job Corps Study
Target population: 16-24 year olds
Interventions (primary listed first): General education; Vocational training; Health care services; Housing services
Start date: 1994
Cost (nominal $): $21,587,202
Sample size: 9,409 treatment; 5,977 control; 15,386 total
Treatment: Treatment group allowed to enroll in Job Corps. Job Corps centers provide vocational training, academic instruction, health care, social skills training, and counseling.
Funding source: DOL, ETA
Outcomes of interest: (1) Employment; (2) Earnings; (3) Education and job training; (4) Welfare receipt; (5) Criminal behavior; (6) Drug use; (7) Health factors; and (8) Household status

Sources: (1) MDRC Board of Directors, 1980; (2) Bell, Burstein, and Orr, 1987; (3) Bell, Bloom, Cave, Doolittle, and Orr, 1994; Bloom, Orr, Bell, Cave, Doolittle, Lin, and Bos, 1997; (4) Freedman, Friedlander, Riccio, 1994; (5) Fein, Beecroft, and Blomquist, 1994; (6) Maguire, Freely, Clymer, Conway, and Schwartz, 2010; (7) Miller, Van Dok, Tessler, and Pennington, 2012; (8) Cave, Bos, Doolittle, and Toussaint, 1993; (9) Burghardt, Schochet, McConnell, Johnson, Gritz, Glazerman, Homrighausen, and Jackson, 2001.
Abbreviations: DOJ = US Department of Justice; HHS = US Department of Health and Human Services; HUD = US Department of Housing and Urban Development; DOL = US Department of Labor; ETA = Employment and Training Administration; Fdn. = Foundation.

Table 3: Details on Selected Randomized Controlled Trials of Job Search Assistance Programs for Low-Income Individuals and Unemployed Workers in the United States

(1) Project Independence (Florida)
Target population: Single-parent heads of household who were required to participate in the program (recipients of AFDC)
Interventions (primary listed first): Job club; General education; Vocational training
Start date: 1990
Cost (nominal $): $3,600,000
Sample size: 13,513 treatment; 4,274 control; 17,787 total
Treatment: The experimental group was eligible to receive Project Independence services and was subject to a participation mandate. Services included independent job search, job club, assessment, basic education, and training. The control group was not eligible for these services and was not subject to a participation mandate.
Funding source: Florida Department of Health and Rehabilitative Services; Ford Fdn.; US Department of Health and Human Services
Outcomes of interest: (1) Employment; (2) Earnings; and (3) AFDC receipt

(2) National Evaluation of Welfare-to-Work Strategies (NEWWS)
Target population: Single-parent welfare recipients
Interventions (primary listed first): Job club; Case management; General education; Vocational training
Start date: 1991
Cost (nominal $): $31,700,000
Sample size: 44,569 total
Treatment: Eleven programs, broadly defined as either employment-focused or education-focused, were tested in seven sites across the US.
Outcomes of interest: (1) Employment; (2) Earnings; (3) Welfare receipt; (4) Cost-effectiveness; and (5) Child well-being

(3) Indiana Welfare Reform Evaluation
Target population: Families on welfare
Interventions (primary listed first): Individual job search assistance; Earned income disregard; Work experience
Start date: 1995
Cost (nominal $): $23,200,000
Sample size: 63,223 treatment 1; 3,863 treatment 2; 3,217 control 1; 1,091 control 2; 71,394 total
Treatment: Experimentals were subject to new welfare reform policies: assisted job search, broader mandatory work participation, earned income disregard, time limits for case assistance, a revised system of child care provision, family benefit cap, and parental responsibility (such as immunizing children). Controls continued under the traditional AFDC policies.
Funding source: Indiana Family and Social Services Administration; US Department of Health and Human Services
Outcomes of interest: (1) Employment; (2) Earnings; (3) Welfare receipt; (4) Income; (5) Health insurance; and (6) Parental responsibility

(4) LA Jobs-First GAIN Evaluation
Target population: Single-parent (AFDC-FG) and two-parent (AFDC-U) welfare families in Los Angeles County
Interventions (primary listed first): Job club; Individual job search assistance; Job search workshop
Start date: 1995
Cost (nominal $): $29,900,000
Sample size: 11,521 treatment 1; 4,039 treatment 2; 4,162 control 1; 1,009 control 2; 20,731 total
Treatment: Members of the treatment group were enrolled in Jobs-First GAIN. These subjects were required to participate in at least one of the job search activities, including job clubs and other informational services and job search training sessions. Experimentals were also exposed to Jobs-First GAIN's intensive work-first message. Sanctions were imposed, usually in the form of partial reductions in welfare benefits, for failure to participate. Controls were not exposed to any of Jobs-First GAIN's services, the intensive work-first message, or sanctions. Controls could still receive assistance from other agencies and were subject to existing welfare rules.
Funding source: Los Angeles Department of Public Social Services; US Department of Health and Human Services; Ford Fdn.
Outcomes of interest: (1) Employment; (2) Earnings; (3) Welfare benefits; (4) Outcomes for children; and (5) Incremental effects compared with previous LA GAIN program

(5) Nevada Claimant Placement Program (NCPP)
Target population: UI claimants
Interventions (primary listed first): Individual job search assistance; Vocational training
Start date: 1977
Sample size: 3,500
Treatment: More staff attention and more referrals, weekly interviews and eligibility checks, all services from the same ES/UI team, which coordinated their efforts.
Outcomes of interest: (1) Weeks of benefits; (2) Earnings; (3) Enforcement of work search rules; (4) Job searches; and (5) Referrals and placements

(6) Claimant Placement and Work Test Demonstration
Target population: UI claimants
Interventions (primary listed first): Job search incentives; Individual job search assistance
Start date: 1983
Cost (nominal $): $225,000
Sample size: 1,485 treatment 1; 1,493 treatment 2; 1,666 treatment 3; 1,277 treatment 4
Treatment: Job search and placement services.
Funding source: US Department of Health and Human Services; Ford Fdn.
Outcomes of interest: (1) Employment and (2) UI payments reductions

(7) Wisconsin Eligibility Review Pilot Project (ERP)
Target population: UI claimants indefinitely separated from most recent job
Interventions (primary listed first): Individual job search assistance
Start date: 1983
Sample size: 5,000
Treatment: 6-hour job search workshop conducted by ES staff; also tried 3-hour job search workshop.
Outcomes of interest: (1) Weeks of benefits; (2) Earnings; (3) Enforcement of work search rules; (4) Job searches; and (5) Referrals and placements

(8) Reemploy Minnesota (REM)
Target population: Unemployed
Interventions (primary listed first): Case management; Individual job search assistance; Job search workshop
Start date: 1988
Cost (nominal $): $835,000
Sample size: 4,212 treatment; control size unknown (roughly 10 times treatment)
Treatment: More personalized and intensive unemployment insurance (UI) services, including case management, intensive job search assistance and job matching, claimant targeting for special assistance, and a job-seeking skills seminar. The control group received regular UI services.
Funding source: Unemployment Insurance Contingent Account of the Minnesota Department of Jobs and Training
Outcomes of interest: (1) Duration of UI benefits and (2) Amount of UI benefits

(9) Kentucky Worker Profiling and Reemployment Services (WPRS) Experiment
Target population: UI claimants
Interventions (primary listed first): Individual job search assistance; Vocational training
Start date: 1994
Cost (nominal $): $15,000
Sample size: 1,236 treatment; 745 control; 1,981 total
Treatment: Structured job search activities, employment counseling, and retraining.
Funding source: Kentucky Department of Employment Services
Outcomes of interest: (1) Earnings; (2) Length of benefit receipt; and (3) Amount of UI benefits received

(10) Maryland Unemployment Insurance Work Search Demonstration
Target population: UI claimants
Interventions (primary listed first): Alternative work search policies
Start date: 1994
Cost (nominal $): $250,000
Sample size: 3,510 treatment 1; 3,455 treatment 2; 3,680 treatment 3; 3,400 treatment 4; 4,812 control 1; 4,901 control 2; 23,758 total
Treatment: 4 different rule changes to Maryland UI eligibility rules.
Funding source: US DOL ETA
Outcomes of interest: (1) UI payments in terms of weeks and dollars; (2) Continuing eligibility; (3) Employment; and (4) Earnings

(11) Reemployment and Eligibility Assessment (REA)
Target population: UI claimants
Interventions (primary listed first): Individual job search assistance; Case management; Vocational training
Start date: 2013
Treatment: (1) Current REA Program: assistance (defined as the provision of labor market information, developing an individual reemployment plan, a referral to reemployment services, and direct provision of reemployment services) plus enforcement (see below). (2) Enforcement Only: the requirement that claimants appear for the REA meeting and that REA program staff verify claimants' eligibility and their participation in work search activities, with referral to adjudication and possible suspension of UI benefits for those who do not participate.
Funding source: US DOL ETA
Outcomes of interest: (1) UI benefit receipt; (2) Employment; and (3) Earnings

Sources: (1) Kemple, Friedlander, and Fellerath, 1995; (2) Hamilton, Freedman, Gennetian, Michalopoulos, and Walter, 2001; (3) Beecroft, Lee, Long, Holcomb, Thomson, Pindus, O'Brien, and Bernestin, 2003; (4) Freedman, Knab, Gennetian, and Navarro, 2000; (5) Steinman, 1978; (6) Johnson, Pfiester, West, and Dickinson, 1984; Corson, Long, and Nicholson, 1984; (7) Herrem and Schmidt, 1983; Jaggers, 1984; (8) Minnesota Department of Jobs and Training, 1990; (9) Black, Smith, Berger, and Noel, 2003; (10) Klepinger, Johnson, Joesch, and Benus, 1997; (11) Klerman, Minzner, Harkness, Mills, Cook, and Savidge-Wilkins, 2013.
Abbreviations: DOL = US Department of Labor; ETA = Employment and Training Administration; Fdn. = Foundation.