second-language strategy instruction: where do we go from...
TRANSCRIPT
Second-language strategy instruction: Where do we go from here?
Luke Plonsky, Northern Arizona University
Situating Strategy Use (SSU3)
October 15, 2019
Modeling L2 development
[Diagram: Language, shaped by learner-EXTERNAL and learner-INTERNAL mechanisms/processes]
• Learner-EXTERNAL mechanisms/processes: input; instruction (+/-, amount, type: explicit, implicit); environment/context (SL, FL; SA, at-home); instructional context (class, lab, immersion); medium (tech, F2F)
• Learner-INTERNAL mechanisms/processes: motivation; working memory; beliefs about language learning; aptitude; anxiety; intelligence; emotions; personality; L2 strategies; ATI
• Language: stress; relative clauses; complexity; FSs
But which of these can we influence?
Individual differences (learner-internal variables associated with L2 development)
How strongly are these constructs associated with L2 proficiency/achievement (as shown via meta-analyses)?
Meta-analytic correlations (r):
• Aptitude: 0.49
• WTC: 0.48
• Motivation: 0.37
• Anxiety: 0.36
• Strategies: 0.31
• WM: 0.26
• L2MSS: 0.22
[Slide callouts on whether each construct can be influenced: “Probably not” (x2); “Probably, but very little research” (x3); “Yes!”]
Li (2016); Elahi Shirvan, Khajavy, MacIntyre, & Taherian (in press); Masgoret & Gardner (2003); Teimouri, Goetze, & Plonsky (2019); Linck, Osthus, Koeth, & Bunting (2014); Al-Hoorie (2018).
Strategy Instruction (L2SI)
Def. Explicit training on specific practices or techniques that can be employed autonomously to improve one’s L2 learning and/or use (Chen, 2007; Ellis & Sinclair, 1989; Tudor, 1996; Taylor et al., 2006).
Discussed in over 400 empirical, theoretical, and review articles and books (see here)
Outline
Part I: State of the science/substantive findings of SI - The WHAT
Part II: Methodological issues - The HOW
Part III: Recommendations
What do we know about the effects of L2SI? Part I
Strategy Instruction
• Intuitive appeal
• Theoretical support
  • Strategic competence (e.g., Canale & Swain, 1980)
  • Learner-centeredness (Nunan, 1988; Tudor, 1996)
  • Developmental sequences (i.e., rate vs. route)
  • Autonomy/self-regulation/self-management (Gu, 2003; Rubin, 2005; Tseng, Dörnyei, & Schmitt, 2006)
“Teachers should not focus exclusively on the content of learning. Instead, attention should also be given to the process. For, to be self-sufficient, learners must know how to learn.”
From Toward a Theory of Instruction (Bruner, 1966)
Critiques of Strategy Instruction & SI Research
• Poor design (e.g., small sample sizes, non-random group assignment, exclusion of comparison groups)
• Unjustified selection of strategies
• Uncertainty of long-term effects
• Lack of valid and reliable instruments
• Incomplete reporting of treatments and results
• Absence of a comprehensive theory (C’mon SLA folks!)
• Cost/benefit ratio concerns
“…what one must teach students of a language is not strategy, but language” (Bialystok, 1990, p. 147).
(Chamot, 2005; Dörnyei, 1995; Kellerman, 1991; Macaro & Erler, 2007; McDonough, 1995; Macaro & Cohen, 2007; Rees-Miller, 1993; Rose et al., 2018)
Reviewing Strategy Instruction (Chamot, 2005; Hassan et al., 2005; McDonough, 1995)
Positive effects for SI…
• Contexts
  • second language, foreign language
  • middle school, HS, university
  • children, adults
  • beginner, intermediate, advanced
  • class, lab
• Treatments
  • Strategy types: cognitive, metacognitive, socioaffective
  • Number of strategies: 1-99
  • Short-, long-term: 1 day-1 year
  • L1, L2; teacher- or researcher-delivered
• Outcomes
  • L2 skills: reading, writing, listening, speaking, vocabulary, grammar
  • others: autonomy, motivation, strategy use, general language ability
Negative/mixed effects for SI…
• Contexts
  • second language, foreign language
  • middle school, HS, university
  • children, adults
  • beginner, intermediate, advanced
  • class, lab
• Treatments
  • Strategy types: cognitive, metacognitive, socioaffective
  • Number of strategies: 1-99
  • Short-, long-term: 1 day-1 year
  • L1, L2; teacher- or researcher-delivered
• Outcomes
  • L2 skills: reading, writing, listening, speaking, vocabulary, grammar
  • others: autonomy, motivation, strategy use, general language ability
Plonsky (2019)
A meta-analysis of the effects of L2SI
RQs:
1. How effective is L2 strategy instruction?
2. What is the relationship between the effectiveness of SI and different learning contexts, treatments, and outcome variables (e.g., skill areas)?
First, what is meta-analysis?
• Empirical approach to reviewing literature
• More systematic and objective than traditional reviews
• Origin? (“Necessity is the mother of invention”)
• THREE hallmarks (Mizumoto, Plonsky, & Egbert, in press; Plonsky & Oswald, 2015)
  1. Exhaustive (vs. selective) searches (sample ≈ population) → Validity, generalizability
  2. Systematic coding for substantive features and effects (vs. subjectively or idiosyncratically interpreted)
  3. Key component: effect sizes (e.g., d, r), which are more precise, stable, intuitive, and informative (vs. p)
• Consequently…
Assumption: Developing scientific knowledge is a cumulative and corporate enterprise.
Meta-analyses provide stable, trustworthy answers!
• Q: Does textual enhancement work?
  A: YES, but the effects are fairly small (d = .22); and it helps for grammar learning but might impede text comprehension (Lee & Huang, 2008; K = 20)
• Q: Is computer-based feedback helpful?
  A: Yes! Just as helpful or more so than face-to-face feedback (Ziegler, 2013; K = 14).
• Q: Is it helpful to provide students with feedback when they make errors in class?
  A: YES, but it depends on what type of feedback you provide (Lyster & Saito, 2010)
• What about SI?
A meta-analysis of the effects of L2SI (Plonsky, 2019)
RQs:
1. How effective is L2 strategy instruction?
2. What is the relationship between the effectiveness of SI and different learning contexts, treatments, and outcome variables (e.g., the four skills)?
Method - Inclusion criteria
• Participants learning an L2
• Treatment that included L2 strategy instruction
• Data collected and compared in a control-experimental (between groups) design
• DV = quantitative measure of the effect of SI
• Sufficient data reported to calculate an effect size (Cohen’s d)
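The last inclusion criterion can be made concrete with a minimal sketch of how Cohen's d is commonly computed from group means, standard deviations, and sample sizes. The pooled-SD formula is the textbook version and is an assumption here, since the slides do not say which variant was used. The function name and the posttest numbers are hypothetical.

```python
import math

def cohens_d(mean_exp, sd_exp, n_exp, mean_ctl, sd_ctl, n_ctl):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_exp - 1) * sd_exp ** 2 + (n_ctl - 1) * sd_ctl ** 2) / (n_exp + n_ctl - 2)
    return (mean_exp - mean_ctl) / math.sqrt(pooled_var)

# Hypothetical posttest scores; not taken from any study in the sample.
d = cohens_d(mean_exp=78.0, sd_exp=10.0, n_exp=30,
             mean_ctl=72.0, sd_ctl=10.0, n_ctl=30)
print(round(d, 2))
```

A study that reports only a p value, with no means, SDs, or test statistic, gives nothing to feed into a function like this, which is why such studies cannot be included.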
Method - Sample
• 77 primary studies of the effectiveness of SI
• 112 unique samples/treatment groups
• 7,890 individual participants
Method - data collection and analysis
• Coded for… (a) substantive and (b) methodological features as well as (c) estimates of treatment effects (Cohen’s d).
• Analysis
  • RQ1: Weighted average overall
  • RQ2: Weighted average for subgroups created according to study features (i.e., potential moderators)
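The two analysis steps can be sketched roughly as follows. The slides say only "weighted average", so the inverse-variance weights used here are an assumption (sample-size weights are another common choice). All function names and the coded samples are hypothetical.

```python
from collections import defaultdict

def d_variance(d, n_exp, n_ctl):
    """Large-sample approximation of the sampling variance of Cohen's d."""
    return (n_exp + n_ctl) / (n_exp * n_ctl) + d ** 2 / (2 * (n_exp + n_ctl))

def weighted_mean_d(samples):
    """samples: list of (d, n_exp, n_ctl), one tuple per independent sample."""
    weights = [1.0 / d_variance(d, n1, n2) for d, n1, n2 in samples]
    return sum(w * s[0] for w, s in zip(weights, samples)) / sum(weights)

def subgroup_means(coded_samples):
    """coded_samples: list of (moderator_level, d, n_exp, n_ctl)."""
    groups = defaultdict(list)
    for level, d, n1, n2 in coded_samples:
        groups[level].append((d, n1, n2))
    return {level: weighted_mean_d(members) for level, members in groups.items()}

# Hypothetical coded data: setting as the moderator.
coded = [
    ("class", 0.50, 30, 30),
    ("class", 0.70, 20, 22),
    ("lab", 0.90, 15, 15),
    ("lab", 0.60, 40, 38),
]
overall = weighted_mean_d([(d, n1, n2) for _, d, n1, n2 in coded])  # RQ1-style
by_setting = subgroup_means(coded)                                  # RQ2-style
print(round(overall, 2), {k: round(v, 2) for k, v in by_setting.items()})
```

Larger samples have smaller sampling variance and therefore pull the average toward their own estimate.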
Results: RQ1
• Overall effect size: d = 0.66 [.62, .69]
• What does this mean?
  • Relative to SLA: “medium”
    Effect size benchmarks (d) from K = 346 primary studies and 91 meta-analyses of L2 research (N > 604,000) (Plonsky & Oswald, 2014):
    Small-ish (25th percentile): .40
    Medium-ish (50th percentile): .70
    Large-ish (75th percentile): 1.00
  • Relative to L1 SI: d = .45 (Hattie et al., 1996)
  • Exp groups score on average 2/3 of an SD above control groups
  • Approximately 3/4 of EG participants outperform average CG participants (Lipsey et al., 2012)
• Additional and practical considerations for interpretation
  • Teacher training
  • Materials development
  • Class time (cost/benefit ratio?)
  • Potential for long-term benefit?
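The "approximately 3/4" reading of d = 0.66 can be reproduced with a short sketch. It assumes normally distributed scores with equal variances in both groups, in which case the share of the experimental group scoring above the control-group mean is the standard normal CDF evaluated at d (often called Cohen's U3). The helper name is hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

d = 0.66
u3 = normal_cdf(d)  # proportion of EG above the CG mean, under the stated assumptions
print(f"{u3:.2f}")
```

The same function applied to the L1 benchmark (d = .45) gives a smaller share, which is one way to compare the two figures on a common footing.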
Results: RQ1 → change over time?
• Overall effect size: d = 0.66 [.62, .69]
• Effect size (d) by period:
  • 1980-2005 (k = 70): 0.43
  • 2006-2015 (k = 42): 0.93
Two possible explanations
- “a notably greater standardization of intervention frameworks has gradually emerged in the past decade” (Ardasheva et al., 2017)
- Methodological (vs. theoretical?) maturity (Plonsky & Gass, 2011; Plonsky & Oswald, 2014)
RQ2: Effects of SI Across Learning Contexts
Effect size (d):
• Context: L2 (k = 13) 0.84; FL (k = 99) 0.57
• Institution: Primary (k = 11) 0.77; Secondary (k = 28) 0.26; University (k = 70) 0.69
• Setting: Class (k = 78) 0.55; Lab (k = 32) 0.82
• Proficiency: Beginner (k = 44) 0.39; Inter/Adv (k = 57) 0.74
RQ2: Effects of SI Across Treatment Types
Effect size (d):
• Type: Cognitive (k = 92) 0.56; Metacognitive (k = 35) 1.00
• Length: ≤2 weeks (k = 50) 0.49; >2 weeks (k = 57) 0.65
• # of strategies: 1 (k = 41) 0.86; >1 (k = 48) 0.58
RQ2: Effects of SI Across L2 Skill Areas
Effect size (d):
• Strategy use (k = 10): 1.11
• Speaking (k = 13): 1.00
• Reading (k = 41): 0.82
• Vocab (k = 33): 0.63
• Writing (k = 8): 0.59
• Listening (k = 10): 0.06
• Pronunciation (k = 2): 2.07
• Grammar (k = 4): 0.75
• General (k = 5): 0.05
Ardasheva, Wang, Adesope, & Valentine (2017)
• Meta-analysis of the effects of L2SI on
  • RQ1: L2 performance
  • RQ2: Other self-regulated outcomes (e.g., anxiety, self-efficacy, attitudes)
• 2008-2014 only
• Sample
  • RQ1: 39 reports (47 samples)
  • RQ2: 16 reports (17 samples)
Overall Results (effect size):
• Language (k = 43): 0.78
• SR learning (k = 17): 0.87
Results, RQ1 (linguistic outcomes; effect size):
• All (K = 43): 0.78
• Vocab (k = 4): 1.23
• Reading (k = 20): 0.76
• Listening (k = 9): 0.68
• General (k = 2): 0.62
• Speaking (k = 2): 0.61
• Writing (k = 7): 0.47
• Grammar (k = 2): 0.13
Results, RQ2 (non-linguistic outcomes; effect size):
• All (K = 17): 0.87
• Strategy effectiveness (k = 20): 1.26
• Strategy use (k = 11): 0.98
• Anxiety (k = 1): 0.90
• Attitudes (k = 2): 0.54
• Self-efficacy (k = 2): 0.27
Additional meta-analytic evidence for strategies
• English learning (overall) (Elahi Shirvan, 2014)
• Reading (Chaury, 2015; Maeng, 2014; Taylor et al., 2006)
• Vocabulary learning strategies for EFL learners (Nematollahi et al., 2017)
• Web-based instruction (Chang & Lin, 2013)
Preliminary Implications and Discussion
SI can be effective in all contexts and for all skills but appears to be stronger:
(a) with non-beginners (“threshold” in Chamot, 2016; “the rich get richer”?)
(b) with metacognitive strategies
(c) over longer periods of time, and
(d) with fewer target strategies (i.e., less is more)
BUT a great deal of further research is still needed across…
- Learner demographics and contexts
- Linguistic (i.e., skills) and non-linguistic domains (e.g., anxiety)
- Individual strategies
The HOW (SI Methods) Part II
We have some issues
• Design & Instrumentation (see e.g., Pawlak, 2019; Rose et al., 2018)
  • Small samples
  • Lack of delayed posttests
  • Lack of theoretical or empirical justification of strategies taught
  • Evidence of reliability (internal consistency) and validity often unknown, BOTH for measures of strategies AND L2 performance!
“You can’t fix with analysis what you bungled by design” (Light et al., 1990)
No analysis, however sophisticated or elegant, can make up for poor instrumentation.
At least we’re not alone?
• True. These problems are pervasive throughout pretty much all of applied linguistics (and throughout the social sciences)!
Reliability evidence: O (observation) = T (true score) + E (error)
Reporting of reliability across domains of L2 research
[Bar chart, axis 0-100. Values shown: 6, 7, 16, 20, 28, 37, 38, 41, 43, 45, 46, 47, 50, 59, 64, 66. Sources labeled: Nekrasova & Becker (2009); Mackey & Goo (2007); Norris & Ortega (2000); Russell & Spada (2006); Derrick (2016); Brown (2016); Jeon & Kaya (2006); Plonsky (2011); Ziegler (2013); Plonsky (2013); Adesope et al. (2010); Lee, Jang, & Plonsky (2015); Adesope et al. (2011); Plonsky & Kim (2016); Plonsky & Gass (2011); Liu & Brown (2015). Callouts: Effects of L2 practice; WCF.]
What about the amount of (measurement) ERROR? What’s typical for the field?
• Reliability generalization meta-analysis (RGM) (Plonsky & Derrick, 2016)
  • K = 537 from 16 L2 journals
  • 2,244 reliability estimates
• Reliability estimates [25th percentile-75th percentile]:
  • Instrument (K = 1,323): 0.82 [.74-.89]
  • Interrater (K = 861): 0.92 [.83-.96]
  • Intrarater (K = 40): 0.95 [.90-.96]
In other domains of L2 research?
• TBLT (K = 85; Plonsky & Kim, 2016)
  rel. = 0.93, 0.87, 0.86, 0.76, N/A, N/A
• L2 pronunciation (K = 77; Saito & Plonsky, 2019)
What about for L2 strategies?
• Or even for different categories of strategies?
• We really don’t know!
• Why does this matter?
  • Unreliability → Error → Threat to validity
  • Attenuation of effects (signal vs. noise)
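Attenuation can be illustrated with the classical (Spearman) correction, which follows from the O = T + E model: the observed correlation equals the true correlation multiplied by the square root of the product of the two measures' reliabilities. This is a minimal sketch under classical test theory assumptions. The function names are hypothetical, and the reliabilities of .74 and .82 are illustrative inputs chosen to be in the range reported for L2 instruments, not estimates for any strategy scale.

```python
import math

def _check(rel):
    if not 0.0 < rel <= 1.0:
        raise ValueError("reliability must be in (0, 1]")

def attenuate(r_true, rel_x, rel_y):
    """Correlation expected between two error-laden measures."""
    _check(rel_x)
    _check(rel_y)
    return r_true * math.sqrt(rel_x * rel_y)

def disattenuate(r_observed, rel_x, rel_y):
    """Spearman's correction: estimated correlation between true scores."""
    _check(rel_x)
    _check(rel_y)
    return r_observed / math.sqrt(rel_x * rel_y)

# Illustrative values only: a hypothetical true correlation of .40 between
# strategy use and proficiency, measured with reliabilities of .74 and .82.
r_obs = attenuate(0.40, rel_x=0.74, rel_y=0.82)
print(round(r_obs, 2), round(disattenuate(r_obs, 0.74, 0.82), 2))
```

When reliability is not reported at all, neither direction of this calculation is possible, so the size of the attenuation stays unknown.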
Validity evidence: O (observation) = T (true score) + E (error)
What we say about validity
• Chapelle (in press): “validation should be of central importance for the credibility of research results”
• TQ Author Guidelines: Authors should provide a “Description of the instruments, what they are designed to measure, and a report of their validity to the extent possible, and their reliability.”
• Ellis (in press): “While researchers have always recognized this issue [validity in SLA measurement], they have largely ignored it, often happy to talk about learning with no consideration of the type of data they had collected”
• Norris & Ortega (2012): “Problematic… is the tendency to assume – rather than build an empirical case for – the validity for whatever assessment method is adopted” (pp. 574-575).
• Schmitt (2019): “Most vocabulary tests are not validated to any great degree.”
What about strategy scales???
See seminal works by Cronbach & Meehl; Messick; Kane; Chapelle, Enright, & Jamieson
Are questionnaires to blame? (Great examples of alternatives in Gu’s plenary and Yashima & MacIntyre’s symposium)
Potential threats to validity
• 1. Indirect measures of the construct of interest
  • Suggestion: triangulation (e.g., + observations; + interview)
• 2. Responses often limited to what is being asked
  • Suggestions: piloting; open-ended items; interviews
• 3. Self-selection bias
  • Suggestions: random or purposive sampling; missing data analysis
• 4. Anonymity → +/- truthfulness?
  • Suggestion: triangulation
• 5. Response values? (3, 5, 9, 1,000?)
  • Suggestions: piloting; clear instructions; scale descriptors
• 6. Quantification without consideration of numerical values
  • Suggestion: rich qualitative data; better use of stats
• 7. Ambiguous (“double-barreled”) items
  • Suggestion: Pilot. Leave room for comments.
Validity = multifaceted
[Diagram of facets: Construct; Predictive; Convergent/concurrent; Discriminant/Divergent; Face]
To what extent does L2 research demonstrate an explicit concern for (different facets of) validity?
“There is perhaps an unwritten agreement that readers will accept measures used in an SLA study at face value without asking about their reliability and validity for the task at hand.” (Cohen & Macaro, 2013, p. 133; see Bachman & Cohen, 1998).
• Is this true in general?
• And for strategies research?
• Do you ever see validity evidence?
• How could we address this Q?
  • Collect a representative sample of studies…
• Synthetic approach
  • Very time-consuming
  • Subject to high inference judgments
• Corpus-based approach
  • Fast and objective
  • Valid?
• Second Language Research Corpus (L2RC; Plonsky, n.d.)
  • 22 journals: AL, ALL, AP, BLC, CMLR, ELTJ, FLA, IJAL, IRAL, JSLW, LAQ, LA, LL, LL&T, LTeaching, LTR, LTesting, MLJ, SLR, SSLA, System, TQ
  • 22,363 articles (1946-2018)
  • 147,293,764 words
• Searched for occurrences of:
  - [predictive, discriminant, divergent, construct, face, convergent, concurrent] + validity
  - validity argument
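A corpus search of this kind could look roughly like the sketch below. The slides do not describe the actual L2RC tooling, so everything here is an assumption: the regular expressions, the decision to count an article once per term regardless of how many hits it contains, and the four-sentence toy corpus.

```python
import re

FACETS = ["predictive", "discriminant", "divergent", "construct",
          "face", "convergent", "concurrent"]

def build_patterns():
    """One case-insensitive pattern per search term."""
    patterns = {
        facet: re.compile(rf"\b{facet}\s+validity\b", re.IGNORECASE)
        for facet in FACETS
    }
    patterns["validity argument"] = re.compile(r"\bvalidity\s+arguments?\b", re.IGNORECASE)
    return patterns

def percent_of_articles(articles):
    """Percentage of articles containing each term at least once."""
    patterns = build_patterns()
    counts = {term: 0 for term in patterns}
    for text in articles:
        for term, pattern in patterns.items():
            if pattern.search(text):
                counts[term] += 1
    return {term: 100.0 * n / len(articles) for term, n in counts.items()}

# Toy corpus standing in for full article texts (hypothetical sentences).
toy_corpus = [
    "We report evidence of construct validity for the scale.",
    "Convergent and discriminant validity were examined.",
    "The test has face validity, and construct validity was assumed.",
    "No measurement discussion here.",
]
rates = percent_of_articles(toy_corpus)
print(rates["construct"], rates["discriminant"], rates["convergent"])
```

The second toy sentence shows a limitation of exact-phrase matching: "Convergent and discriminant validity" is counted for discriminant but missed for convergent.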
Results of the corpus search (occurrences per 100 articles; note scale: axis 0-10):
• Construct: 1.93 (2 in every 100 articles)
• Face: 0.71
• Predictive: 0.54
• Concurrent: 0.52
• Discriminant: 0.24
• V Argument: 0.19
• Convergent: 0.18
• Divergent: 0.10 (1 in every 1,000 articles)
How might the strategies literature compare???
(What about false positives? False negatives?)
It’s not all bad!
• Nakatani (2006): Scale development/validation
  • Development of the Oral Communication Strategy Inventory (OCSI)
  • Stage 1: Open-ended questionnaire (N = 80)
  • Stage 2: Piloted with 400 → (exploratory) factor analysis (item structure) → 8 categories for speaking and 7 for listening strategies
  • Stage 3: Compared with data from SILL (N = 62)
• Mizumoto & Takeuchi (2012): Scale validation
  • Self-regulating Capacity in Vocabulary Learning Scale (SRCvoc)
  • Study 1: N = 443 → item analysis: ITC of > .4; alpha for subscales
  • EFA to examine factor structure
  • Study 2: N = 914 → alpha for subscales; CFA
• Ardasheva & Tretter (2013): Validation of modified version of SILL
  • Revision of items and piloting
  • Administered to 1057 child learners of ESL
  • CFA → 6 factor solution
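Two of the statistics named in these validation studies, Cronbach's alpha and the item-total correlation (ITC), can be sketched in a few lines. This is a generic textbook computation, not the procedure of any study cited above. The "corrected" ITC (item vs. the sum of the remaining items) is one common variant and is an assumption here, as are the function names and the made-up Likert responses.

```python
import math
from statistics import mean, variance

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def corrected_item_total(items, index):
    """Correlation of one item with the sum of all other items."""
    rest = [sum(s for j, s in enumerate(scores) if j != index) for scores in zip(*items)]
    return pearson_r(items[index], rest)

# Made-up responses: 3 Likert items answered by 5 respondents.
responses = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 3],
    [4, 2, 5, 1, 4],
]
alpha = cronbach_alpha(responses)
itcs = [corrected_item_total(responses, i) for i in range(len(responses))]
flagged = [i for i, r in enumerate(itcs) if r <= 0.4]  # .4 cutoff as on the slide
print(round(alpha, 2), [round(r, 2) for r in itcs], flagged)
```

Alpha is computed per subscale in practice, so a call like this would be repeated for each set of items that the factor analysis groups together.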
• See also
  • Tragant et al. (2013)
  • Ardasheva (2016)
  • Teng & Zhang (2016)
Summary for Part II
• What we need is a non-casual, rigorous, and systematic agenda focused on measurement as it pertains to L2 strategies and strategy instruction.
Looking ahead → L2SI research wish list (see Sudina & Plonsky, in press)
What/substance
• SI across all skill areas. Esp: writing, listening, pronunciation, test-taking
• Aptitude-treatment interactions with L2SI (e.g., with beliefs, working memory; see Yashima, Nishida, & Mizumoto, 2017)
• SI for specific learning contexts: SA, CALL/MALL, EMI/CLIL
• Teacher training
  • Studies of teacher beliefs regarding SI and
  • Effectiveness of teacher training interventions for SI
• The role of strategic transfer (L1 → L2; L2 → Ln)
How/Method: Designs
• Non-“WEIRD” samples: e.g., SL, pre-adolescent, advanced learners
• Validity evidence/arguments for the utility of individual strategies ← essential justification for L2SI studies but RARELY present
• A clearer understanding of the long-term effects of SI
• “Bigger” and more longitudinal designs, at the curricular level
How/Method: Measurement
• Validity arguments for measures of both (a) strategy usage (Takeuchi, 2019; Tseng et al., 2006) and (b) L2 performance
  - Situated, qualitative, and mixed methods (Pawlak & Oxford, 2018; Rose et al., 2018)
  - Macro + micro perspective (Pawlak, in press)
  - Scenario-based scales → + contextualization (see Teimouri, 2018)
  - Studies of the predictive validity of individual strategies and L2 performance (as a pre-requisite for SI)
• …all to then be made available on the IRIS database (iris-database.org) → + consistency across studies!!
How/Method: Data report and analyses
• More thorough reporting of
  • Sample characteristics (e.g., proficiency)
  • Treatments (e.g., length/intensity, materials)
  • Data (ESs, CIs, visuals, reliability coefficients)
• More informed use of quantitative analyses (Mizumoto & Plonsky, 2015; Nix, 2018; Takeuchi, 2019)
  • E.g., Rasch analysis; multivariate models; corrections for attenuation due to measurement error
How/Method: Beyond individual studies
• Replication studies!
• Additional meta-analyses of SI focused on individual strategies or skills (e.g., vocab, speaking)
• Systematic review and meta-analysis of reliability coefficients (what’s normal?)
Thank you!
Luke Plonsky
lukeplonsky@gmail.com
lukeplonsky.wordpress.com