neurocog - interplay of bigram frequency and orthographic...

19
Bilingualism: Language and Cognition 19 (3), 2016, 578–596 C Cambridge University Press 2015 doi:10.1017/S1366728915000292 Interplay of bigram frequency and orthographic neighborhood statistics in language membership decision YULIA OGANIAN Department of Education and Psychology, Freie Universitaet Berlin, Germany Bernstein Center for Computational Neuroscience, Humboldt-Universität zu Berlin, Germany MARKUS CONRAD Department of Education and Psychology, Freie Universitaet Berlin, Germany Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department of Education and Psychology, Freie Universitaet Berlin, Germany HAUKE R. HEEKEREN Department of Education and Psychology, Freie Universitaet Berlin, Germany Bernstein Center for Computational Neuroscience, Humboldt-Universität zu Berlin, Germany KATHARINA SPALEK Institut für deutsche Sprache und Linguistik, Humboldt Universitaet zu Berlin, Germany (Received: October 23, 2014; final revision received: April 29, 2015; accepted: April 30, 2015; first published online 2 July 2015) Language-specific orthography (i.e., letters or bigrams that exist in only one language) is known to facilitate language membership recognition. Yet the contribution of continuous sublexical and lexical statistics to language membership decisions during visual word processing is unknown. Here, we used pseudo-words to investigate whether continuous sublexical and lexical statistics bias explicit language decisions (Experiment 1) and language attribution during naming (Experiment 2). We also asked whether continuous statistics would have an effect in the presence of orthographic markers. Language attribution in both experiments was influenced by lexical neighborhood size differences between languages, even in presence of orthographic markers. Sublexical frequencies of occurrence affected reaction times only for unmarked pseudo-words in both experiments, with greater effects in naming. Our results indicate that bilinguals rely on continuous language-specific statistics at sublexical and lexical levels to infer language membership. Implications are discussed with respect to models of bilingual visual word recognition. Keywords: bilingualism, language membership decision, naming, sublexical units, lexical neighborhood, statistical learning, orthographic marker Introduction The ultimate aim of language comprehension is the extraction of a coherent and meaningful percept from a stream of noisy and often ambiguous input. In the bilingual case, the input can potentially stem from two different languages. This adds a level of complexity to language comprehension, since the two languages rely on different, and often contradicting, linguistic We thank Frederike Albers and Ulrike Schlickeiser for assistance in data collection and analysis of the voice files, Carsten Schliewe for technical assistance, and Assaf Breska for fruitful discussions of the manuscript. This research was supported by the Deutsche Forschungsgemeinschaft (GRK1589/1, doctoral scholarship to YO). Address for correspondence: Yulia Oganian, Freie Universitaet Berlin, Department of Education and Psychology, Habelschwerter Allee 45, JK25/215, 14195 Berlin, Germany [email protected] Supplementary material can be found online at http://dx.doi.org/10.1017/S1366728915000292 structures. Thus, early recognition of the language of a word is potentially advantageous for language comprehension in several ways. First, it allows narrowing down possible interpretations of preceding and following inputs to those that fit the language of the current input at the single word as well as at the sentential level (Libben & Titone, 2009). Second, knowing the language of an input facilitates its recognition itself by narrowing lexical search to one language only (Casaponsa, Carreiras & Duñabeitia, 2014; Schulpen, Dijkstra, Schriefers & Hasper, 2003). The extent to which this is possible is of high importance for models of bilingual word recognition, which assume that lexical access is fundamentally language-unselective, meaning http://dx.doi.org/10.1017/S1366728915000292 Downloaded from http:/www.cambridge.org/core. Universidad de la Laguna, on 09 Oct 2016 at 09:47:19, subject to the Cambridge Core terms of use, available at http:/www.cambridge.org/core/terms.

Upload: others

Post on 01-Nov-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

Bilingualism Language and Cognition 19 (3) 2016 578ndash596 Ccopy Cambridge University Press 2015 doi101017S1366728915000292

Interplay of bigram frequencyand orthographicneighborhood statistics inlanguage membership decisionlowast

Y U L I A O G A N I A NDepartment of Education and Psychology Freie UniversitaetBerlin GermanyBernstein Center for Computational NeuroscienceHumboldt-Universitaumlt zu Berlin GermanyM A R K U S C O N R A DDepartment of Education and Psychology Freie UniversitaetBerlin GermanyUniversidad de La Laguna Tenerife SpainA R A S H A RYA N IDepartment of Education and Psychology Freie UniversitaetBerlin GermanyH AU K E R H E E K E R E NDepartment of Education and Psychology Freie UniversitaetBerlin GermanyBernstein Center for Computational NeuroscienceHumboldt-Universitaumlt zu Berlin GermanyK AT H A R I NA S PA L E KInstitut fuumlr deutsche Sprache und Linguistik HumboldtUniversitaet zu Berlin Germany

(Received October 23 2014 final revision received April 29 2015 accepted April 30 2015 first published online 2 July 2015)

Language-specific orthography (ie letters or bigrams that exist in only one language) is known to facilitate languagemembership recognition Yet the contribution of continuous sublexical and lexical statistics to language membershipdecisions during visual word processing is unknown Here we used pseudo-words to investigate whether continuoussublexical and lexical statistics bias explicit language decisions (Experiment 1) and language attribution during naming(Experiment 2) We also asked whether continuous statistics would have an effect in the presence of orthographic markersLanguage attribution in both experiments was influenced by lexical neighborhood size differences between languages evenin presence of orthographic markers Sublexical frequencies of occurrence affected reaction times only for unmarkedpseudo-words in both experiments with greater effects in naming Our results indicate that bilinguals rely on continuouslanguage-specific statistics at sublexical and lexical levels to infer language membership Implications are discussed withrespect to models of bilingual visual word recognition

Keywords bilingualism language membership decision naming sublexical units lexical neighborhood statistical learningorthographic marker

Introduction

The ultimate aim of language comprehension is theextraction of a coherent and meaningful percept froma stream of noisy and often ambiguous input In thebilingual case the input can potentially stem from twodifferent languages This adds a level of complexityto language comprehension since the two languagesrely on different and often contradicting linguistic

lowast We thank Frederike Albers and Ulrike Schlickeiser for assistancein data collection and analysis of the voice files Carsten Schliewefor technical assistance and Assaf Breska for fruitful discussionsof the manuscript This research was supported by the DeutscheForschungsgemeinschaft (GRK15891 doctoral scholarship to YO)

Address for correspondenceYulia Oganian Freie Universitaet Berlin Department of Education and Psychology Habelschwerter Allee 45 JK25215 14195 Berlin Germanyyuliaoganianfu-berlindeSupplementary material can be found online at httpdxdoiorg101017S1366728915000292

structures Thus early recognition of the languageof a word is potentially advantageous for languagecomprehension in several ways First it allows narrowingdown possible interpretations of preceding and followinginputs to those that fit the language of the currentinput at the single word as well as at the sententiallevel (Libben amp Titone 2009) Second knowing thelanguage of an input facilitates its recognition itselfby narrowing lexical search to one language only(Casaponsa Carreiras amp Duntildeabeitia 2014 SchulpenDijkstra Schriefers amp Hasper 2003) The extent to whichthis is possible is of high importance for models ofbilingual word recognition which assume that lexicalaccess is fundamentally language-unselective meaning

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 579

that any visual input activates orthographic lexical andphonological representations in both languages (ConradAlvarez Afonso amp Jacobs 2014 De Groot Delmaaramp Lupker 2000 Dijkstra Hilberink-Schulpen amp vanHeuven 2010 but see also Costa La Heij amp Navarrete2006)

In natural settings language membership informationcan be inferred from a variety of cues external tothe word itself such as sentential context or priorinformation about the speaker Even in the absenceof such external cues however bilinguals are ableto reliably identify the language of a single word(Casaponsa et al 2014 Grainger amp Dijkstra 1992 Vaidamp Frenck-Mestre 2002 van Kesteren Dijkstra amp deSmedt 2012) For example a trivial source for languagemembership information is the lexical word form itselfwhich ndash unless a cognate ndash is unambiguously associatedwith a certain language (Grainger amp Dijkstra 1992)Differences between orthographic scripts (ie EnglishndashChinese) or unique graphemes (ie the letter Oslash inNorwegian vs English van Kesteren et al 2012) canalso cue language membership Even in the case ofa shared orthographic system two-letter combinations(bigrams) that are orthographically legal in only oneof two languages (ie TX in Basque vs SpanishCasaponsa et al 2014 OE in French vs English Vaidamp Frenck-Mestre 2002) termed orthographic markerscan speed language categorization and lexical accessA recent extension of the most prominent cognitivemodel of visual word recognition the bilingual interactiveactivation model plus (BIA+ van Kesteren et al 2012)incorporates the effects of orthographic markers onlanguage membership decisions through inclusion ofseparate sublexical and lexical language nodes whichrepresent the language membership of language-uniqueunits ie orthographic markers and lexical word-formsrespectively Importantly in this model the languagemembership of a letter string can be identified basedon sublexical information alone This is in line with thesuggestion that orthographic markers allow for languageidentification without processing of the complete letterstring or access to lexical representations (Vaid amp Frenck-Mestre 2002) However the amount of lexical activationduring language decisions has not been explicitly tested

The evidence reviewed so far focuses exclusively ondichotomic language membership cues such as languagemembership of lexical word forms and orthographicmarkers which exist in one language only Howeverample evidence suggests that the visual word processingsystem is sensitive to continuous statistical patterns inthe native language For instance lexical orthographicneighborhood size a measure of lexical co-activationduring lexical search (Andrews 1997) is negativelycorrelated with RT on a range of tasks such as lexicaldecision and naming (Carreiras Perea amp Grainger 1997

for a review see Andrews 1997) Similarly a largebody of findings shows that the frequency of sublexicalunits modulates word processing in different languages(syllables Carreiras Alvarez amp de Vega 1993 ConradCarreiras Tamm amp Jacobs 2009 Conrad Graingeramp Jacobs 2007 Conrad amp Jacobs 2004 morphemesDeutsch Frost Pollatsek amp Rainer 2000 Frost KuglerDeutsch amp Forster 2005 bigrams Westbury amp Buchanan2002) Furthermore Bailey and Hahn (2001) concurrentlymanipulated word-likeness at sublexical and lexicallevels and demonstrated a unique contribution of eachof the two levels to word-likeness ratings of visuallyand auditory presented pseudo-words They found acorrelation between word-likeness ratings and lexicalneighborhood sizes as well as with sublexical bigramfrequencies

In bilinguals differential effects of within- andbetween-language orthographic neighborhood sizes onword processing were found (De Groot BorgwaldtBos amp van den Eijnden 2002 Lemhoumlfer DijkstraSchriefers Baayen Grainger amp Zwitserlood 2008) Forexample Lemhoumlfer and colleagues (2008) showed that ina progressive demasking task within-language neighborshave stronger effects on word recognition than between-language neighbors Moreover Conrad et al (2014)recently found that bilingual visual word recognition issensitive to the frequencies of syllabic units in bothlanguages In particular processing of sublexical units infirst and second language (L1 and L2 respectively) seemsto benefit from sublexical frequencies in the respectivenon-presented language

Given monolingualsrsquo ability to use statistical linguisticinformation and bilingualsrsquo sensitivity to linguisticstructure of both of their languages the question ariseswhether bilinguals can use statistical information forlanguage membership identification Namely differencesin frequencies of occurrence between languages couldbe used as cues to determine language membershipThis is only possible if bilinguals are able to assesssublexical (ie bigram frequencies) as well as lexical (ieorthographic neighborhood sizes) statistics separately forL1 and L2 The alternative would be that they representstatistical information in a way that summarizes acrosslanguages in which case statistical information could notbe used in language membership decisions For examplein the former case a GermanndashEnglish bilingual wouldcorrectly realize that forn is more English-like thanGerman-like although it is orthotactically legal in bothlanguages By contrast the latter case would implicatethat the same bilingual would only be able to estimatehow word-like forn is without differentiating betweenlanguages The effect of fine-grained sublexical andlexical statistical information in language membershipdecisions has so far not been investigated systematically(though note that both alternatives are possible within

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

580 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

models of non-selective lexical access) Bridging this gapis important for any comprehensive model of bilingualword recognition that describes language membershiprepresentations and accounts for their involvement invisual word recognition such as the BIA+ model (Dijkstraamp van Heuven 2002 van Kesteren et al 2012)

The present study

We designed the present study to shed light onthe contribution of continuous sublexical statisticsmeasured through bigram frequencies and lexicalstatistics measured through orthographic neighborhoodsizes to language membership attribution Moreover weinvestigated whether continuous language membershipinformation is ignored if a decision can be made based onan orthographic marker as suggested by Vaid and Frenck-Mestre (2002) Finally we asked whether the contributionof sublexical and lexical variables to language decisionsdepends on the output modality ndash comparing performanceacross decision and naming tasks

First we performed a corpus analysis to investigate theextent to which English and German words differ in theirbigram frequencies and orthographic neighborhood sizesin English and German We reasoned that only variablesthat are differently distributed in the two languages arepotentially relevant for language membership decisionsFor example for bigram frequency to be informative aboutlanguage membership bigrams must exist that are morefrequent in German than in English and vice versa Atthe level of orthographic neighborhoods only if Germanwords have more orthographic neighbors in Germanthan in English and vice versa can the orthographicneighborhood of a word be diagnostic of its languagemembership

In the second step (Experiment 1) we designedan experiment that probed the effect of differences inorthographic neighborhood sizes and bigram frequencymeasures between languages on language decisionbehavior of GermanndashEnglish bilinguals To isolate theeffects of differences in language-similarity statisticswe employed pseudo-words (PWs) This allowed us toeliminate the influence of additional factors such asword frequency and semantics and rendered the exactmanipulation of frequency variables more feasible Webuilt on results from the corpus analysis to identifythe relevant range for each variable Additionally wecontrasted the effects of continuously varying variableswith the effect of orthographic markers (such as pf asin Pfanne (pan) for German or gh as in laugh forEnglish) We expected to find no effects of continuousdifferences in sublexical and lexical similarity to thetwo languages if bilinguals are sensitive to orthographicmarkers only However if bilinguals are sensitive tofine-grained statistical differences between languages

continuous variation of sublexical and lexical statisticsshould affect bilingualsrsquo language membership decisions

In the third step (Experiment 2) we investigatedthe extent to which the effects of probabilistic cuesdepend on the output modality by contrasting the2-alternative language decision task of Experiment1 with a naming task The language decisiontask requires an explicit decision between the twolanguages based on sublexical and lexical cuesContrary to this naming requires a mapping fromorthographic sublexical representations to language-specific phonology which for pseudo-words does notnecessarily require involvement of lexical representations(Coltheart Rastle Perry Langdon amp Ziegler 2001)Thus we expected sublexical representations to play alarger role in the naming task than in the languagedecision task and orthographic neighborhoods to playa larger role in the language decision task than in thenaming task A comparison of naming and languagedecision tasks will thus show whether language attributionin naming would preferentially be resolved based onsublexical orthographic and phonological informationwith decreasing involvement of lexical representations

Corpus analysis

Methods

The corpus analysis was based on all spelling-correctedwords of the German and the full English Subtlexdatabases (Brysbaert Buchmeier Conrad Jacobs Boelteamp Boehl 2011) We analyzed Levenshtein distance(OLD20 Yarkoni Balota amp Yap 2008) as a measure oforthographic neighborhood size and mean and maximalbigram frequencies as measures of sublexical frequencyThese variables were calculated for each word of bothcorpuses separately for each language This made itpossible to examine the extent to which the statisticalproperties of a given variable differ between languagesand whether words of one language are probable in theother language with respect to each variable Since thefocus of this research was on characterizing propertiesof letter strings that are unique for a certain languageidentical interlingual homographs were excluded from thisanalysis

Orthographic neighborhood sizeThe Levenshtein orthographic neighborhood distance ofa letter string (Yarkoni et al 2008) is its average distanceto its 20 closest Levenshtein orthographic neighborsin a lexicon The Levenshtein distance between twoletter strings is computed as the minimal number ofletter deletions insertions and changes that is neededto transform their orthographic word forms into eachother We computed OLD20 using the ldquovwrrdquo library in

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 581

the statistical package R (Keuleers 2013) OLD20 isa variable with a strong dependency on word lengthsuch that if a specific word length is more common ina language the orthographic neighborhoods of wordsof that length are denser than for other word lengthsTo control for the difference in average word lengthbetween German and English and make OLD20 valuesin German and English comparable OLD20 was mean-normalized for each word length within each languageAll language comparisons were based on normalizedscores For each word in both corpora we computed thedifference between its average Levenshtein distances tothe 20 closest neighbors in the German Subtlex (OLDG)and in the English Subtlex (OLDE) and the differencebetween these variables (diffOLD) DiffOLD was codedsuch that positive values reflect a larger orthographicneighborhood in German than in English

Bigram frequenciesPositional log10 token bigram frequencies were computedbased on word form frequencies from the Subtlexdatabases and normalized to frequency per millionbigrams This allows better comparability acrosslanguages than normalization per million words asGerman words are longer on average Bigram frequencieswere also mean-normalized within each language Foreach word in the two corpora we computed the differencebetween the mean German bigram frequency (mBGG) andthe mean English bigram frequency (mBGE) diffBF ofits constituent bigrams

Furthermore we hypothesized that language decisionsmight be guided not only by the average difference inbigram frequency between languages but also by singlesublexical units with a strong difference in languagesimilarity For each word we thus also identified itsconstituent bigram with the largest difference betweenits frequency in German and English We denote thismaximal bigram frequency difference as maxdiffBF withpositive values for high German-typicality and negativevalues for higher English-typicality

Results and discussion

The descriptive statistics for the three difference variablesare presented in Table 1 and their distributions are plottedin Figure 1

Of the three difference variables diffOLD shows thelargest difference between its distributions in English andGerman (Figure 1 upper panel) with more than 90of words of each language having more orthographicneighbors in their language than in the respectively otherlanguage (drsquo = 27) While the distributions of maxdiffBFare also similarly different between languages (drsquo = 14)there is a large overlap between the distributions of diffBFin German and English (drsquo = 024)

Figure 1 Distribution of differences in orthographicneighborhood size (diffOLD upper panel) mean bigramfrequency differences (diffBF middle panel) and maximalbigram frequency differences (maxdiffBF bottom panel) inGerman and English SUBTLEX lexica

The corpus analysis thus reveals that lexicalneighborhoods can be very informative for the languagemembership of a word in line with previous findingsshowing larger same-language than cross-languageneighborhoods (Marian Bartolotti Chabal amp Shook2012) It also provides maxdiffBF as a potential source oflanguage membership information whereas mean bigramfrequency differences were more similar across languagesIn the following two experiments we investigate the effectsof all three variables on language attribution

Experiment 1 Language Decision Task

Methods amp materials

ParticipantsThis study was conducted with highly proficient GermanndashEnglish bilinguals (n = 25 5 male ages 21ndash35 meanage 27 years) All were students at the Freie Universitaetin Berlin and had studied English as their first foreignlanguage in high school All had spent at least 9consecutive months living in an English-speaking foreigncountry (GB USA English-speaking Canada Australiaor New Zealand) Participants were right-handed hadnormal or corrected-to-normal vision and did not sufferfrom a reading disability or other learning disordersAll participants completed an online language historyquestionnaire (adapted from Li Sepanski amp Zhao 2006)prior to participation and were only admitted to the

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

582 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 1 Descriptive statistics of the difference variables their pair-wise correlation in the English andGerman SUBTLEX lexica and Cohenrsquos d for the comparison between English and German distributions ofeach variable Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequencydifference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

SUBTLEX Lexicon

English German

Cohenrsquos d mean min max SD mean min max SD

diffOLD 27 minus17 minus54 33 10 21 minus32 75 13

diffBF 02 minus09 minus59 34 09 02 minus55 44 11

maxdiffBF 14 minus10 minus31 28 10 10 minus31 28 10

Correlations

diffBF maxdiffBF diffBF maxdiffBF

diffOLD 022 027 034 038

diffBF 03 028

Table 2 Profiles of participants in Experiment 1 and 2 There were no significant differences between participants ofExperiment 1 and 2

Experiment 1 Experiment 2

L1 (German) L2(English) L1 (German) L2(English)

mean SD mean SD mean SD mean SD

Lextale 908 69 805lowast 87 904 50 770lowast 130

reading rate wmin real words 1314 102 124 lowast 90 1300 140 1232 lowast 95

pseudo-words 835 175 813 135 835 200 786 106

Self report Age of acquisition (years) ndash ndash 97 164 ndash ndash 93 20

Proficiency1 ndash ndash 60 05 ndash ndash 59 04

Accent2 ndash ndash 24 10 ndash ndash 27 08

Notes1 On a scale of 1 (basic) ndash 7 (native)2 On a scale of 1(no accent) ndash 7 (very strong accent)lowast p lt 05 for comparison between L1 and L2

study if they fulfilled the above criteria Self-reportsof L2 proficiency were made on a 1-7 Likert scaleseparately for reading writing speaking and listeningabilities The averaged self-estimated English proficiencywas 59 (range 45ndash7) Additionally participantsrsquo generalproficiency and reading abilities were assessed after theexperiment using the LEXTALE tests of German andEnglish proficiency (Lemhoumlfer amp Broersma 2011) Thetests consist of short lexical decision tasks which includewords of varying frequency and pseudo-words The finalscore is the average percentage of correct responses towords and pseudo-words Reading abilities were assessedusing the reading and phonological decoding subtests ofthe TOWRE (Torgesen amp Rashotte 1999) for English andthe word and pseudo-word reading subtests of the SLRT-IItest (Moll amp Landerl 2010) for German Both tests assess

reading rate (wordsmin) in single-item reading of wordsand pseudo-words (PW) Participantsrsquo language profileis summarized in Table 2 Participants were recruitedthrough advertisements on campus and in mailing lists forexperiment participation All participants completed aninformed consent form prior to beginning the experimentThey were reimbursed either monetary or with coursecredit

Stimuli amp DesignThe stimulus set contained 192 marked pseudo-wordshalf of which were orthographically legal in Germanonly while the other half was orthographically legalin English only and 192 neutral pseudo-words whichwere orthographically legal in both languages Within theset of marked pseudo-words diffOLD and diffBF were

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 583

Table 3 Properties of neutral pseudo-words stimuli ofExperiments 1 and 2 Neutral pseudo-words containedonly bigrams that were legal in both languagesOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

distribution correlations

mean min max diffBF maxdiffBF

diffOLD minus016 minus260 240 024 038

diffBF 010 minus270 230 060

maxdiffBF 005 minus100 110

varied parametrically Within the set of neutral pseudo-words diffOLD diffBF and maxdiffBF were variedparametrically Pseudo-words were selected from theEnglish lexicon project (Balota Yap Cortese HutchisonKessler Loftis Neely Nelson Simpson amp Treiman2007) a German pseudo-word database provided byauthors MC and AA as well as a pseudo-word setcreated specifically for this study Note that all markedpseudo-words were composed of letters that exist in bothlanguages (ie excluding the German letters ldquoauml ouml uumlszligrdquo) such that decisions based on low-level visual pop-outeffects would not be possible

Neutral pseudo-wordsNeutral PWs were orthographically legal and pronounce-able with the same number of phonemes and syllableswhen pronounced in either language Selection of neutralpseudo-words was guided by three considerations Firstwe ensured that for all three variables ndash diffBF maxdiffBFand diffOLD ndash the range of overlap between their Englishand German distributions was covered (Figure 1 for thedistributions in the corpus and Table 3 for properties of thestimulus set) Second for each of the variables half of thestimuli had positive values and half had negative valuessuch that an approximately equal number of neutral PWwas German-like or English-like respectively in each ofthe three variables Third the pseudo-words were chosensuch as to reduce the pair-wise correlations between thethree variables as well as their correlations with pseudo-word length (see Table 3) Note that the correlationbetween diffBF and maxdiffBF in neutral pseudo-wordsis high as the two variables rely on the same informationsource

Marked pseudo-wordsMarked PWs contained a letter combination (1ndash3 letters)that violated orthographic rules of one of the languages

Table 4 Properties of marked pseudo-words MarkedPWs contained at least one bigram that was legal in onelanguage (ie the marker language) and had a frequencyof 0 in monomorphemic words of the other languageOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

English-marked PW German-marked PW

mean min max mean min max

diffOLD minus090 minus400 100 070 minus200 270

diffBF minus070 minus360 160 050 minus240 310

maxdiffBF minus180 minus270 minus120 160 110 270

(such as th ght or word-final y for English or pf hlor word-middle sch for German markers) The bigramfrequency of marker bigrams is not necessarily 0 in therespective other language if computed across the wholeSUBTLEX because these letter combinations could occurin rare cases at the boundaries between (otherwise free)morphemes or in loan words and proper names Forexample the German marker pf occurs in English wordssuch as cuPFul or camPFire but never as a graphemewithin a morpheme or syllable Similarly the Englishmarker th occurs in German names (eg FuerTH)Greek loan words (eg THeologie (theology)) and asa bigram unifying two normally free morphemes (egachTHundert (eight hundred)) but not as a grapheme inother etymologically German words Since all our PWswere mono-morphemic we recomputed the frequenciesfor all orthographic markers based on mono-morphemicwords (that is consisting of only one root morphemeplus a simple ending) of the relevant word lengthsIn the resulting set of words marker frequencies were0 in the respectively other language As orthographicmarkers coincide with bigrams that define maxdiffBFvalues marked pseudo-words were chosen to differparametrically in their diffBF and diffOLD values onlywhereas English-marked and German-marked pseudo-words were balanced with respect to maxdiffBF values(correlation between diffOLD and diffBF rE = minus19rG = 42 see Table 4 for descriptive statistics) Lengthand syllable number were matched across marker typeconditions and did not correlate with experimentalvariables

To further ensure that marked pseudo-words wereindeed pronounceable and orthographically legal in onelanguage only and neutral pseudo-words in both allpseudo-words were rated for their pronounceability andorthographic legality in German and English by 3 native

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

584 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

speakers of German and English respectively Onlypseudo-words that were rated as pronounceable andorthographically legal in both languages by all refereeswere included in the study as neutral pseudo-wordsSimilarly marked pseudo-words were only included ifall referees rated them as pronounceable in the markinglanguage and illegal in the other language

ProcedureParticipants performed a speeded language decision taskin which they were required to categorize whether apseudo-word was more likely to be a German or anEnglish word Stimuli were presented on a 18primeprime computerscreen (Arial font size 40) using the Psychtoolbox(Brainard 1997 Kleiner Brainard Pelli Ingling Murrayamp Broussard 2007) for MATLAB (version 7100Mathworks Inc Natick Massachusetts) A trial consistedof a fixation cross presented for 800 ms followedby presentation of the stimulus until the participantresponded for at most 2 sec Responses were given bybutton press on a custom-made response box with the twoindex fingers Buttons were assigned to a language byGerman and British flags in the respective corners of thescreen below the stimulus Button assignment remainedconstant within but switched between blocks On eachtrial response latency (RT) and language decision wererecorded

The task began with instructions and an example trialfollowed by the 384 experimental trials presented in 8blocks of equal length1 Pseudo-randomization ensuredthat not more than three equally marked items wouldappear consecutively Participants could have self-pacedbreaks after each block If a participant failed to respondbefore the time-out for more than three times he or shewas reminded to respond faster Subsequent to completionof the language decision task participants completed theproficiency tests described above

Statistical analysisAll statistical analyses were performed in the opensource statistical programming environment R (R coreteam 2013) RTs were analysed with linear mixed-effects models and language decisions were analysedusing logistic mixed-effects models (Baayen Davidsonamp Bates 2008 Jaeger 2008) as implemented in thelme4 package for R (Bates Maechler Bolker amp Walker2014) and included subjects and items as random factorsDue to the complex random factor structure of our datathe implementation of maximal random factor structurewas not practicable Instead we modelled the randomfactor structure with the intercept and the maximal order

1 The first 2 blocks contained neutral pseudo-words only Block typedid not affect the variables of interest and hence results are reportedcollapsed across block types

interaction of fixed factors as recently suggested (Barr2013) Significance of fixed effects was assessed throughtype-III Wald-tests of single parameter estimates Wefollow the convention of the lme4 package by reportingthe z-test statistic for logistic mixed-effects models andthe χ2-test statistic for linear models (note though thatboth test statistics are equivalent)

Note that it is not possible to define correctand wrong responses for neutral pseudo-words sincemost of them contain conflicting language membershipinformation and no exclusive language cues Thuslanguage decision analyses for neutral pseudo-wordsmodelled the probability of English responses Howeverfor marked pseudo-words the marking creates a strongpreference for the marker language Thus for markedpseudo-words we conducted analyses on the marker-incongruent responses to which we refer as errorresponses

First to examine whether orthographic markers biasedlanguage decisions we analysed the effects of marker type(German (G) vs English (E) vs neutral (N))2 Then theeffects of continuous variables on language decision andresponse latencies were analysed separately for neutraland marked pseudo-words

Outlier exclusion was performed for each participantand separately for marked and neutral PWs based onseparate multiple regression models for each participantcontaining the same factors that were used for RTanalyses The expected RT for each trial was determinedand trials with a difference of more than 25 residualsrsquo SDbetween expected and actual RT were labelled as outliersand discarded from further analyses

Results

Data of all participants were included in the analysesOutlier analysis led to a removal of 25 of the trialsFor illustration purposes only continuous variables weresubdivided in three equally large subsets of which thetwo extreme ones are depicted in figures with the labelldquoGermanrdquo for the subset with most positive ie mostGerman-like values and the label ldquoEnglishrdquo for the mostnegative ie most English-like values

Effect of marker presence on language decisionsWe analyzed the effect of marker language (G vs E vs N)on the probability of English responses with a logisticmixed-effects model which included marker languageas a fixed factor intercepts for subject and items asrandom factors and random slopes for marker language

2 Due to the high number of marker-congruent responses for markedPWs there were not enough error trials (lt 15 for most participants)to analyze RTs across marked and neutral PWs including response asfactor Thus we analyze RTs separately for marked and neutral PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 585

Table 5 Mean and standard deviations (SD) for RTs to marked and neutral PW in Experiment 1 and 2

RT

Response German Response English Response English

Exp 1 mean SD mean SD mean SD

Markedness English 853 315 746 204 90 29

German 776 220 859 316 16 36

Neutral 871 279 905 292 45 50

Exp 2

mean SD mean SD mean SD

Markedness English 758 244 778 311 79 41

German 730 215 792 307 17 38

Neutral 712 211 768 318 42 49

within subject Marker language was contrast-coded inthe model with the two orthogonal contrasts English-marked and neutral versus German-marked and English-marked vs neutral PWs Marker language influenced theattribution of pseudo-words towards that language (E90 G 17 N 45 response English see Table 5)χ2 (1) = 4629 p lt 001 Planned comparisons whichwere directly encoded in the model showed that English-marked and neutral PWs were more often classified asEnglish than German-marked PWs b = 208 SD = 011z = 187 p lt 001 and that the same held for English-marked as compared to neutral pseudo-words b = 14SD = 008 z = 180 p lt 001

Effects of continuous variables in the absence ofmarkers Language decisionThe analysis of language decisions on neutral PWsinvolved the continuous variables diffBF maxdiffBF anddiffOLD as well as all possible interactions between themas fixed effects Random factor structure of the modelincluded intercepts for subjects and items as well asrandom slopes for the interaction of diffBF maxdiffBFand diffOLD As expected neutral pseudo-words weremore often categorized as similar to the language in whichtheir orthographic neighborhood was larger b = minus028SD = 007 z = 39 p lt 001 (Figure 2) No significanteffects were obtained for maxdiffBF and diffBF or for anyinteractions

Effects of continuous variables in the absence ofmarkers Response timesThe analysis of response times to neutral PWs (Table 6)contained the fixed factors response language (G vsE dummy coded) and the continuous variables diffBFmaxdiffBF and diffOLD Random factor structure of themodel included intercepts for subjects and items randomslopes for the interaction of response diffBF maxdiffBFand diffOLD within subjects and random slopes for

response within items Responses ldquoGermanrdquo were overallfaster than responses ldquoEnglishrdquo (see Table 5 for meanRTs) While there was no main effect for diffOLD itsinteraction with response was significant (Figure 3b)increases in diffOLD (ie increasingly larger Germancompared to English neighborhood) resulted in fasterreaction times (b =minus354) for ldquoGermanrdquo responses whileslowing ldquoEnglishrdquo responses (b = ndash354 + 2251 19)A similar marginally significant pattern was foundfor maxdiffBF (Figure 3c b response German minus12b response English 14) ldquoGermanrdquo responses were faster ifmaxdiffBF was positive (ie German-typical) whereasldquoEnglishrdquo responses were faster for negative maxdiffBFvalues (ie English-typical) The significant interactionof diffOLD and maxdiffBF (Figure 3a) resulted fromfaster responses for items with consistent diffOLDand maxdiffBF values and slower responses wheneither positive diffOLD was accompanied by negativemaxdiffBF values or vice versa In other words responseswere fast if both variables supported the choice of the samelanguage and slowed if they pointed to different languages(eg when a PW had more orthographic neighbors inEnglish than in German but a more German-typicalmaxdiffBF value) supporting the importance of bothvariables for the decision process All other main effectsand interactions were not significant

Interactions of markers and continuous variables3Language decisionAnalysis of errors for marked PWs contained thefixed factors marker language (G vs E) and thecontinuous variables diffBF and diffOLD as well asall of their interactions Random factor structure of

3 In both analyses marker language was dummy-coded with Englishas baseline such that main effects of continuous variables reflectedtheir effect for English-marked PWs and their interaction with markerlanguage the addition in beta for German-marked PWs as comparedto English-marked PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

586 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 6 Summary of linear mixed-effect regression for reaction times to neutral PW in Experiment 1 Orthographicneighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency difference diffBF = BFG ndash BFE maximalbigram frequency difference maxdiffBF is the maximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ 2 p-value Significance

Intercept 87774 3248 2702 7303 lt 001 lowastlowastdiffOLD minus354 600 minus059 035 56

diffBF minus1673 899 minus186 346 06 +

maxdiffBF minus1158 957 minus121 147 23

Response1 4749 893 532 2827 lt 001 lowastlowastdiffOLD diffBF 1259 931 135 183 18

diffOLD maxdiffBF minus1825 890 minus205 421 04 lowastdiffBF maxdiffBF minus485 1028 minus047 022 64

diffOLD Response 2251 882 255 652 01 lowastdiffBF Response 165 1213 014 002 89

maxdiffBF Response 2554 1391 184 337 07 +

diffOLD diffBF maxdiffBF 1189 1041 114 131 25

diffOLD diffBF Response minus1524 1309 minus116 135 24

diffOLD maxdiffBF Response 962 1300 074 055 46

diffBF maxdiffBF Response minus2260 1409 minus16 257 11

diffOLD diffBF maxdiffBF Response minus1478 1450 minus102 104 31

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

Experiment 1 Experiment 2

30

40

50

60

70

German English German EnglishdiffOLD

R

espo

nse

Eng

lish

maxdiffBF

German

English

Figure 2 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) and maximal bigramfrequency difference (maxdiffBF) on language decisions for neutral pseudo-words in Experiments 1 and 2

the model included intercepts for subjects and itemsand random slopes for the interaction of responseand diffOLD within subjects Overall above 80 ofresponses to marked pseudo-words were in line with theirmarker although more marker-incongruent responseswere made for German-marked than for English-markedPWs b = 06 SD = 022 z = 28 p = 006 (G 17E 10 incongruent responses see Table 5) Cruciallythe effect of marker language interacted with diffOLDb = minus06 SD = 02 z = ndash33 p lt001 (Figure 4a)Planned comparisons showed an increase in marker-

incongruent responses for German-marked PWs with adecrease in diffOLD (ie more English neighbors thanGerman neighbors) b = minus04 SD = 01 z = minus37 plt 001 whereas diffOLD had no effect on responses forEnglish-marked PWs (p gt 1)

Interactions of markers and continuous variablesResponse times

Analysis of RTs to marked PWs involved the fixedfactors marker language (G vs E) diffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 587

Table 7 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 1 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

(Intercept) 74556 2469 3020 91195 lt 001 lowastlowastdiffOLD 736 694 106 113 29

diffBF minus1226 842 minus146 212 15

Marker Language1 4767 1393 342 1170 lt 001 lowastlowastdiffOLD diffBF minus323 641 minus050 025 61

diffOLD Marker Language minus2539 965 minus263 693 008 lowastdiffBF Marker Language 379 1093 035 012 73

diffOLD diffBF Marker Language 1580 850 186 346 06 +

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Marker language was dummy-coded with English as baseline

800

850

900

950

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p = 04

800

850

900

950

German EnglishResponse

RT

[mse

c]b) diffOLD x Response p = 01

800

850

900

950

German EnglishResponse

RT

[mse

c]

c) maxdiffBF x Response p = 07

maxdiffBF German English

diffOLD German English

Figure 3 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) and language decisions (response) on RTs to neutral pseudo-words in Experiment 1 a)Interaction effect of diffOLD and maxdiffBF b) Interaction effect of diffOLD and response c) marginally significantinteraction effect of maxdiffBF and response

(Table 7) Random factor structure of the model includedintercepts for subjects and items and random slopesfor the interaction of marker language diffBF anddiffOLD Participants made 12 errors on average (English-marked 2ndash24 error trials German-marked 6ndash33 error

trials) which was not sufficient for an RT analysison error trials Thus RTs were analyzed for correctresponses only The results largely mirrored the patternof language decisions on marked PWs Responses toGerman PWs were slower than responses to English PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

588 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Figure 4 Visualization of interaction effects of differences in orthographic neighborhood size (diffOLD) and markerlanguage on the errors and RTs to marked pseudo-words in Experiments 1 and 2

(b = 477 see Table 7) Moreover there was a significant2-way interaction of marker language and diffOLD (b= minus254 Figure 4b) and a marginally significant 3-wayinteraction between diffOLD diffBF and marker language(b = 158) A separate model for English-marked PWsshowed no effects of the continuous variables for English-marked PWs (p gt 1) However the separate model forGerman-marked PWs showed that responses for German-marked PWs were slowed for larger English than Germanneighborhoods (ie decrease in diffOLD) b = minus189 SD= 65 t = minus280 χ2 (1) = 78 p = 005 Additionallythis effect was attenuated for more positive diffBF valuesb = 120 SD = 53 t = 227 χ2 (1) = 51 p = 02In other words differences in orthographic neighborhoodsize mattered less if the mean bigram frequency differencewas in line with marker language for German-markedPWs

Discussion

Our data replicate previous studies (Casaponsa et al2014 Thomas amp Allport 2000 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) that showed thatsublexical orthographic markers provide strong language

decision cues as orthographically marked pseudo-wordswere mostly categorized in line with their marker

Our data further extends previous findings in severalways First we show that continuous differences inlexical neighborhood size and maximal bigram frequencydifferences influence language decisions in unmarkedpseudo-words This was apparent in language attributionas well as in response latencies to neutral pseudo-words In particular response latencies increased whensublexical vs lexical levels provided conflicting languagemembership information suggesting that languagemembership information from both levels is integratedtowards a language decision Second we also find thatorthographic neighborhoods bias language decisions evenfor marked letter strings ndash as reflected by more marker-incongruent responses and slowed marker-congruentresponses for L1-marked PWs with larger orthographicneighborhoods in L2 than in L1 This effect was absentfor L2-marked PWs probably because our German-dominant participants could discard these pseudo-wordsas non-German based on their orthographic markers ndashrepresenting violations of L1 orthographic patterns ndashalone This also explains why correct categorizationsof English marked PWs were faster than respectivecorrect German categorizations (for a similar argument

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 2: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

Sublexical and lexical effects on language membership decision 579

that any visual input activates orthographic lexical andphonological representations in both languages (ConradAlvarez Afonso amp Jacobs 2014 De Groot Delmaaramp Lupker 2000 Dijkstra Hilberink-Schulpen amp vanHeuven 2010 but see also Costa La Heij amp Navarrete2006)

In natural settings language membership informationcan be inferred from a variety of cues external tothe word itself such as sentential context or priorinformation about the speaker Even in the absenceof such external cues however bilinguals are ableto reliably identify the language of a single word(Casaponsa et al 2014 Grainger amp Dijkstra 1992 Vaidamp Frenck-Mestre 2002 van Kesteren Dijkstra amp deSmedt 2012) For example a trivial source for languagemembership information is the lexical word form itselfwhich ndash unless a cognate ndash is unambiguously associatedwith a certain language (Grainger amp Dijkstra 1992)Differences between orthographic scripts (ie EnglishndashChinese) or unique graphemes (ie the letter Oslash inNorwegian vs English van Kesteren et al 2012) canalso cue language membership Even in the case ofa shared orthographic system two-letter combinations(bigrams) that are orthographically legal in only oneof two languages (ie TX in Basque vs SpanishCasaponsa et al 2014 OE in French vs English Vaidamp Frenck-Mestre 2002) termed orthographic markerscan speed language categorization and lexical accessA recent extension of the most prominent cognitivemodel of visual word recognition the bilingual interactiveactivation model plus (BIA+ van Kesteren et al 2012)incorporates the effects of orthographic markers onlanguage membership decisions through inclusion ofseparate sublexical and lexical language nodes whichrepresent the language membership of language-uniqueunits ie orthographic markers and lexical word-formsrespectively Importantly in this model the languagemembership of a letter string can be identified basedon sublexical information alone This is in line with thesuggestion that orthographic markers allow for languageidentification without processing of the complete letterstring or access to lexical representations (Vaid amp Frenck-Mestre 2002) However the amount of lexical activationduring language decisions has not been explicitly tested

The evidence reviewed so far focuses exclusively ondichotomic language membership cues such as languagemembership of lexical word forms and orthographicmarkers which exist in one language only Howeverample evidence suggests that the visual word processingsystem is sensitive to continuous statistical patterns inthe native language For instance lexical orthographicneighborhood size a measure of lexical co-activationduring lexical search (Andrews 1997) is negativelycorrelated with RT on a range of tasks such as lexicaldecision and naming (Carreiras Perea amp Grainger 1997

for a review see Andrews 1997) Similarly a largebody of findings shows that the frequency of sublexicalunits modulates word processing in different languages(syllables Carreiras Alvarez amp de Vega 1993 ConradCarreiras Tamm amp Jacobs 2009 Conrad Graingeramp Jacobs 2007 Conrad amp Jacobs 2004 morphemesDeutsch Frost Pollatsek amp Rainer 2000 Frost KuglerDeutsch amp Forster 2005 bigrams Westbury amp Buchanan2002) Furthermore Bailey and Hahn (2001) concurrentlymanipulated word-likeness at sublexical and lexicallevels and demonstrated a unique contribution of eachof the two levels to word-likeness ratings of visuallyand auditory presented pseudo-words They found acorrelation between word-likeness ratings and lexicalneighborhood sizes as well as with sublexical bigramfrequencies

In bilinguals differential effects of within- andbetween-language orthographic neighborhood sizes onword processing were found (De Groot BorgwaldtBos amp van den Eijnden 2002 Lemhoumlfer DijkstraSchriefers Baayen Grainger amp Zwitserlood 2008) Forexample Lemhoumlfer and colleagues (2008) showed that ina progressive demasking task within-language neighborshave stronger effects on word recognition than between-language neighbors Moreover Conrad et al (2014)recently found that bilingual visual word recognition issensitive to the frequencies of syllabic units in bothlanguages In particular processing of sublexical units infirst and second language (L1 and L2 respectively) seemsto benefit from sublexical frequencies in the respectivenon-presented language

Given monolingualsrsquo ability to use statistical linguisticinformation and bilingualsrsquo sensitivity to linguisticstructure of both of their languages the question ariseswhether bilinguals can use statistical information forlanguage membership identification Namely differencesin frequencies of occurrence between languages couldbe used as cues to determine language membershipThis is only possible if bilinguals are able to assesssublexical (ie bigram frequencies) as well as lexical (ieorthographic neighborhood sizes) statistics separately forL1 and L2 The alternative would be that they representstatistical information in a way that summarizes acrosslanguages in which case statistical information could notbe used in language membership decisions For examplein the former case a GermanndashEnglish bilingual wouldcorrectly realize that forn is more English-like thanGerman-like although it is orthotactically legal in bothlanguages By contrast the latter case would implicatethat the same bilingual would only be able to estimatehow word-like forn is without differentiating betweenlanguages The effect of fine-grained sublexical andlexical statistical information in language membershipdecisions has so far not been investigated systematically(though note that both alternatives are possible within

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

580 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

models of non-selective lexical access) Bridging this gapis important for any comprehensive model of bilingualword recognition that describes language membershiprepresentations and accounts for their involvement invisual word recognition such as the BIA+ model (Dijkstraamp van Heuven 2002 van Kesteren et al 2012)

The present study

We designed the present study to shed light onthe contribution of continuous sublexical statisticsmeasured through bigram frequencies and lexicalstatistics measured through orthographic neighborhoodsizes to language membership attribution Moreover weinvestigated whether continuous language membershipinformation is ignored if a decision can be made based onan orthographic marker as suggested by Vaid and Frenck-Mestre (2002) Finally we asked whether the contributionof sublexical and lexical variables to language decisionsdepends on the output modality ndash comparing performanceacross decision and naming tasks

First we performed a corpus analysis to investigate theextent to which English and German words differ in theirbigram frequencies and orthographic neighborhood sizesin English and German We reasoned that only variablesthat are differently distributed in the two languages arepotentially relevant for language membership decisionsFor example for bigram frequency to be informative aboutlanguage membership bigrams must exist that are morefrequent in German than in English and vice versa Atthe level of orthographic neighborhoods only if Germanwords have more orthographic neighbors in Germanthan in English and vice versa can the orthographicneighborhood of a word be diagnostic of its languagemembership

In the second step (Experiment 1) we designedan experiment that probed the effect of differences inorthographic neighborhood sizes and bigram frequencymeasures between languages on language decisionbehavior of GermanndashEnglish bilinguals To isolate theeffects of differences in language-similarity statisticswe employed pseudo-words (PWs) This allowed us toeliminate the influence of additional factors such asword frequency and semantics and rendered the exactmanipulation of frequency variables more feasible Webuilt on results from the corpus analysis to identifythe relevant range for each variable Additionally wecontrasted the effects of continuously varying variableswith the effect of orthographic markers (such as pf asin Pfanne (pan) for German or gh as in laugh forEnglish) We expected to find no effects of continuousdifferences in sublexical and lexical similarity to thetwo languages if bilinguals are sensitive to orthographicmarkers only However if bilinguals are sensitive tofine-grained statistical differences between languages

continuous variation of sublexical and lexical statisticsshould affect bilingualsrsquo language membership decisions

In the third step (Experiment 2) we investigatedthe extent to which the effects of probabilistic cuesdepend on the output modality by contrasting the2-alternative language decision task of Experiment1 with a naming task The language decisiontask requires an explicit decision between the twolanguages based on sublexical and lexical cuesContrary to this naming requires a mapping fromorthographic sublexical representations to language-specific phonology which for pseudo-words does notnecessarily require involvement of lexical representations(Coltheart Rastle Perry Langdon amp Ziegler 2001)Thus we expected sublexical representations to play alarger role in the naming task than in the languagedecision task and orthographic neighborhoods to playa larger role in the language decision task than in thenaming task A comparison of naming and languagedecision tasks will thus show whether language attributionin naming would preferentially be resolved based onsublexical orthographic and phonological informationwith decreasing involvement of lexical representations

Corpus analysis

Methods

The corpus analysis was based on all spelling-correctedwords of the German and the full English Subtlexdatabases (Brysbaert Buchmeier Conrad Jacobs Boelteamp Boehl 2011) We analyzed Levenshtein distance(OLD20 Yarkoni Balota amp Yap 2008) as a measure oforthographic neighborhood size and mean and maximalbigram frequencies as measures of sublexical frequencyThese variables were calculated for each word of bothcorpuses separately for each language This made itpossible to examine the extent to which the statisticalproperties of a given variable differ between languagesand whether words of one language are probable in theother language with respect to each variable Since thefocus of this research was on characterizing propertiesof letter strings that are unique for a certain languageidentical interlingual homographs were excluded from thisanalysis

Orthographic neighborhood sizeThe Levenshtein orthographic neighborhood distance ofa letter string (Yarkoni et al 2008) is its average distanceto its 20 closest Levenshtein orthographic neighborsin a lexicon The Levenshtein distance between twoletter strings is computed as the minimal number ofletter deletions insertions and changes that is neededto transform their orthographic word forms into eachother We computed OLD20 using the ldquovwrrdquo library in

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 581

the statistical package R (Keuleers 2013) OLD20 isa variable with a strong dependency on word lengthsuch that if a specific word length is more common ina language the orthographic neighborhoods of wordsof that length are denser than for other word lengthsTo control for the difference in average word lengthbetween German and English and make OLD20 valuesin German and English comparable OLD20 was mean-normalized for each word length within each languageAll language comparisons were based on normalizedscores For each word in both corpora we computed thedifference between its average Levenshtein distances tothe 20 closest neighbors in the German Subtlex (OLDG)and in the English Subtlex (OLDE) and the differencebetween these variables (diffOLD) DiffOLD was codedsuch that positive values reflect a larger orthographicneighborhood in German than in English

Bigram frequenciesPositional log10 token bigram frequencies were computedbased on word form frequencies from the Subtlexdatabases and normalized to frequency per millionbigrams This allows better comparability acrosslanguages than normalization per million words asGerman words are longer on average Bigram frequencieswere also mean-normalized within each language Foreach word in the two corpora we computed the differencebetween the mean German bigram frequency (mBGG) andthe mean English bigram frequency (mBGE) diffBF ofits constituent bigrams

Furthermore we hypothesized that language decisionsmight be guided not only by the average difference inbigram frequency between languages but also by singlesublexical units with a strong difference in languagesimilarity For each word we thus also identified itsconstituent bigram with the largest difference betweenits frequency in German and English We denote thismaximal bigram frequency difference as maxdiffBF withpositive values for high German-typicality and negativevalues for higher English-typicality

Results and discussion

The descriptive statistics for the three difference variablesare presented in Table 1 and their distributions are plottedin Figure 1

Of the three difference variables diffOLD shows thelargest difference between its distributions in English andGerman (Figure 1 upper panel) with more than 90of words of each language having more orthographicneighbors in their language than in the respectively otherlanguage (drsquo = 27) While the distributions of maxdiffBFare also similarly different between languages (drsquo = 14)there is a large overlap between the distributions of diffBFin German and English (drsquo = 024)

Figure 1 Distribution of differences in orthographicneighborhood size (diffOLD upper panel) mean bigramfrequency differences (diffBF middle panel) and maximalbigram frequency differences (maxdiffBF bottom panel) inGerman and English SUBTLEX lexica

The corpus analysis thus reveals that lexicalneighborhoods can be very informative for the languagemembership of a word in line with previous findingsshowing larger same-language than cross-languageneighborhoods (Marian Bartolotti Chabal amp Shook2012) It also provides maxdiffBF as a potential source oflanguage membership information whereas mean bigramfrequency differences were more similar across languagesIn the following two experiments we investigate the effectsof all three variables on language attribution

Experiment 1 Language Decision Task

Methods amp materials

ParticipantsThis study was conducted with highly proficient GermanndashEnglish bilinguals (n = 25 5 male ages 21ndash35 meanage 27 years) All were students at the Freie Universitaetin Berlin and had studied English as their first foreignlanguage in high school All had spent at least 9consecutive months living in an English-speaking foreigncountry (GB USA English-speaking Canada Australiaor New Zealand) Participants were right-handed hadnormal or corrected-to-normal vision and did not sufferfrom a reading disability or other learning disordersAll participants completed an online language historyquestionnaire (adapted from Li Sepanski amp Zhao 2006)prior to participation and were only admitted to the

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

582 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 1 Descriptive statistics of the difference variables their pair-wise correlation in the English andGerman SUBTLEX lexica and Cohenrsquos d for the comparison between English and German distributions ofeach variable Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequencydifference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

SUBTLEX Lexicon

English German

Cohenrsquos d mean min max SD mean min max SD

diffOLD 27 minus17 minus54 33 10 21 minus32 75 13

diffBF 02 minus09 minus59 34 09 02 minus55 44 11

maxdiffBF 14 minus10 minus31 28 10 10 minus31 28 10

Correlations

diffBF maxdiffBF diffBF maxdiffBF

diffOLD 022 027 034 038

diffBF 03 028

Table 2 Profiles of participants in Experiment 1 and 2 There were no significant differences between participants ofExperiment 1 and 2

Experiment 1 Experiment 2

L1 (German) L2(English) L1 (German) L2(English)

mean SD mean SD mean SD mean SD

Lextale 908 69 805lowast 87 904 50 770lowast 130

reading rate wmin real words 1314 102 124 lowast 90 1300 140 1232 lowast 95

pseudo-words 835 175 813 135 835 200 786 106

Self report Age of acquisition (years) ndash ndash 97 164 ndash ndash 93 20

Proficiency1 ndash ndash 60 05 ndash ndash 59 04

Accent2 ndash ndash 24 10 ndash ndash 27 08

Notes1 On a scale of 1 (basic) ndash 7 (native)2 On a scale of 1(no accent) ndash 7 (very strong accent)lowast p lt 05 for comparison between L1 and L2

study if they fulfilled the above criteria Self-reportsof L2 proficiency were made on a 1-7 Likert scaleseparately for reading writing speaking and listeningabilities The averaged self-estimated English proficiencywas 59 (range 45ndash7) Additionally participantsrsquo generalproficiency and reading abilities were assessed after theexperiment using the LEXTALE tests of German andEnglish proficiency (Lemhoumlfer amp Broersma 2011) Thetests consist of short lexical decision tasks which includewords of varying frequency and pseudo-words The finalscore is the average percentage of correct responses towords and pseudo-words Reading abilities were assessedusing the reading and phonological decoding subtests ofthe TOWRE (Torgesen amp Rashotte 1999) for English andthe word and pseudo-word reading subtests of the SLRT-IItest (Moll amp Landerl 2010) for German Both tests assess

reading rate (wordsmin) in single-item reading of wordsand pseudo-words (PW) Participantsrsquo language profileis summarized in Table 2 Participants were recruitedthrough advertisements on campus and in mailing lists forexperiment participation All participants completed aninformed consent form prior to beginning the experimentThey were reimbursed either monetary or with coursecredit

Stimuli amp DesignThe stimulus set contained 192 marked pseudo-wordshalf of which were orthographically legal in Germanonly while the other half was orthographically legalin English only and 192 neutral pseudo-words whichwere orthographically legal in both languages Within theset of marked pseudo-words diffOLD and diffBF were

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 583

Table 3 Properties of neutral pseudo-words stimuli ofExperiments 1 and 2 Neutral pseudo-words containedonly bigrams that were legal in both languagesOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

distribution correlations

mean min max diffBF maxdiffBF

diffOLD minus016 minus260 240 024 038

diffBF 010 minus270 230 060

maxdiffBF 005 minus100 110

varied parametrically Within the set of neutral pseudo-words diffOLD diffBF and maxdiffBF were variedparametrically Pseudo-words were selected from theEnglish lexicon project (Balota Yap Cortese HutchisonKessler Loftis Neely Nelson Simpson amp Treiman2007) a German pseudo-word database provided byauthors MC and AA as well as a pseudo-word setcreated specifically for this study Note that all markedpseudo-words were composed of letters that exist in bothlanguages (ie excluding the German letters ldquoauml ouml uumlszligrdquo) such that decisions based on low-level visual pop-outeffects would not be possible

Neutral pseudo-wordsNeutral PWs were orthographically legal and pronounce-able with the same number of phonemes and syllableswhen pronounced in either language Selection of neutralpseudo-words was guided by three considerations Firstwe ensured that for all three variables ndash diffBF maxdiffBFand diffOLD ndash the range of overlap between their Englishand German distributions was covered (Figure 1 for thedistributions in the corpus and Table 3 for properties of thestimulus set) Second for each of the variables half of thestimuli had positive values and half had negative valuessuch that an approximately equal number of neutral PWwas German-like or English-like respectively in each ofthe three variables Third the pseudo-words were chosensuch as to reduce the pair-wise correlations between thethree variables as well as their correlations with pseudo-word length (see Table 3) Note that the correlationbetween diffBF and maxdiffBF in neutral pseudo-wordsis high as the two variables rely on the same informationsource

Marked pseudo-wordsMarked PWs contained a letter combination (1ndash3 letters)that violated orthographic rules of one of the languages

Table 4 Properties of marked pseudo-words MarkedPWs contained at least one bigram that was legal in onelanguage (ie the marker language) and had a frequencyof 0 in monomorphemic words of the other languageOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

English-marked PW German-marked PW

mean min max mean min max

diffOLD minus090 minus400 100 070 minus200 270

diffBF minus070 minus360 160 050 minus240 310

maxdiffBF minus180 minus270 minus120 160 110 270

(such as th ght or word-final y for English or pf hlor word-middle sch for German markers) The bigramfrequency of marker bigrams is not necessarily 0 in therespective other language if computed across the wholeSUBTLEX because these letter combinations could occurin rare cases at the boundaries between (otherwise free)morphemes or in loan words and proper names Forexample the German marker pf occurs in English wordssuch as cuPFul or camPFire but never as a graphemewithin a morpheme or syllable Similarly the Englishmarker th occurs in German names (eg FuerTH)Greek loan words (eg THeologie (theology)) and asa bigram unifying two normally free morphemes (egachTHundert (eight hundred)) but not as a grapheme inother etymologically German words Since all our PWswere mono-morphemic we recomputed the frequenciesfor all orthographic markers based on mono-morphemicwords (that is consisting of only one root morphemeplus a simple ending) of the relevant word lengthsIn the resulting set of words marker frequencies were0 in the respectively other language As orthographicmarkers coincide with bigrams that define maxdiffBFvalues marked pseudo-words were chosen to differparametrically in their diffBF and diffOLD values onlywhereas English-marked and German-marked pseudo-words were balanced with respect to maxdiffBF values(correlation between diffOLD and diffBF rE = minus19rG = 42 see Table 4 for descriptive statistics) Lengthand syllable number were matched across marker typeconditions and did not correlate with experimentalvariables

To further ensure that marked pseudo-words wereindeed pronounceable and orthographically legal in onelanguage only and neutral pseudo-words in both allpseudo-words were rated for their pronounceability andorthographic legality in German and English by 3 native

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

584 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

speakers of German and English respectively Onlypseudo-words that were rated as pronounceable andorthographically legal in both languages by all refereeswere included in the study as neutral pseudo-wordsSimilarly marked pseudo-words were only included ifall referees rated them as pronounceable in the markinglanguage and illegal in the other language

ProcedureParticipants performed a speeded language decision taskin which they were required to categorize whether apseudo-word was more likely to be a German or anEnglish word Stimuli were presented on a 18primeprime computerscreen (Arial font size 40) using the Psychtoolbox(Brainard 1997 Kleiner Brainard Pelli Ingling Murrayamp Broussard 2007) for MATLAB (version 7100Mathworks Inc Natick Massachusetts) A trial consistedof a fixation cross presented for 800 ms followedby presentation of the stimulus until the participantresponded for at most 2 sec Responses were given bybutton press on a custom-made response box with the twoindex fingers Buttons were assigned to a language byGerman and British flags in the respective corners of thescreen below the stimulus Button assignment remainedconstant within but switched between blocks On eachtrial response latency (RT) and language decision wererecorded

The task began with instructions and an example trialfollowed by the 384 experimental trials presented in 8blocks of equal length1 Pseudo-randomization ensuredthat not more than three equally marked items wouldappear consecutively Participants could have self-pacedbreaks after each block If a participant failed to respondbefore the time-out for more than three times he or shewas reminded to respond faster Subsequent to completionof the language decision task participants completed theproficiency tests described above

Statistical analysisAll statistical analyses were performed in the opensource statistical programming environment R (R coreteam 2013) RTs were analysed with linear mixed-effects models and language decisions were analysedusing logistic mixed-effects models (Baayen Davidsonamp Bates 2008 Jaeger 2008) as implemented in thelme4 package for R (Bates Maechler Bolker amp Walker2014) and included subjects and items as random factorsDue to the complex random factor structure of our datathe implementation of maximal random factor structurewas not practicable Instead we modelled the randomfactor structure with the intercept and the maximal order

1 The first 2 blocks contained neutral pseudo-words only Block typedid not affect the variables of interest and hence results are reportedcollapsed across block types

interaction of fixed factors as recently suggested (Barr2013) Significance of fixed effects was assessed throughtype-III Wald-tests of single parameter estimates Wefollow the convention of the lme4 package by reportingthe z-test statistic for logistic mixed-effects models andthe χ2-test statistic for linear models (note though thatboth test statistics are equivalent)

Note that it is not possible to define correctand wrong responses for neutral pseudo-words sincemost of them contain conflicting language membershipinformation and no exclusive language cues Thuslanguage decision analyses for neutral pseudo-wordsmodelled the probability of English responses Howeverfor marked pseudo-words the marking creates a strongpreference for the marker language Thus for markedpseudo-words we conducted analyses on the marker-incongruent responses to which we refer as errorresponses

First to examine whether orthographic markers biasedlanguage decisions we analysed the effects of marker type(German (G) vs English (E) vs neutral (N))2 Then theeffects of continuous variables on language decision andresponse latencies were analysed separately for neutraland marked pseudo-words

Outlier exclusion was performed for each participantand separately for marked and neutral PWs based onseparate multiple regression models for each participantcontaining the same factors that were used for RTanalyses The expected RT for each trial was determinedand trials with a difference of more than 25 residualsrsquo SDbetween expected and actual RT were labelled as outliersand discarded from further analyses

Results

Data of all participants were included in the analysesOutlier analysis led to a removal of 25 of the trialsFor illustration purposes only continuous variables weresubdivided in three equally large subsets of which thetwo extreme ones are depicted in figures with the labelldquoGermanrdquo for the subset with most positive ie mostGerman-like values and the label ldquoEnglishrdquo for the mostnegative ie most English-like values

Effect of marker presence on language decisionsWe analyzed the effect of marker language (G vs E vs N)on the probability of English responses with a logisticmixed-effects model which included marker languageas a fixed factor intercepts for subject and items asrandom factors and random slopes for marker language

2 Due to the high number of marker-congruent responses for markedPWs there were not enough error trials (lt 15 for most participants)to analyze RTs across marked and neutral PWs including response asfactor Thus we analyze RTs separately for marked and neutral PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 585

Table 5 Mean and standard deviations (SD) for RTs to marked and neutral PW in Experiment 1 and 2

RT

Response German Response English Response English

Exp 1 mean SD mean SD mean SD

Markedness English 853 315 746 204 90 29

German 776 220 859 316 16 36

Neutral 871 279 905 292 45 50

Exp 2

mean SD mean SD mean SD

Markedness English 758 244 778 311 79 41

German 730 215 792 307 17 38

Neutral 712 211 768 318 42 49

within subject Marker language was contrast-coded inthe model with the two orthogonal contrasts English-marked and neutral versus German-marked and English-marked vs neutral PWs Marker language influenced theattribution of pseudo-words towards that language (E90 G 17 N 45 response English see Table 5)χ2 (1) = 4629 p lt 001 Planned comparisons whichwere directly encoded in the model showed that English-marked and neutral PWs were more often classified asEnglish than German-marked PWs b = 208 SD = 011z = 187 p lt 001 and that the same held for English-marked as compared to neutral pseudo-words b = 14SD = 008 z = 180 p lt 001

Effects of continuous variables in the absence ofmarkers Language decisionThe analysis of language decisions on neutral PWsinvolved the continuous variables diffBF maxdiffBF anddiffOLD as well as all possible interactions between themas fixed effects Random factor structure of the modelincluded intercepts for subjects and items as well asrandom slopes for the interaction of diffBF maxdiffBFand diffOLD As expected neutral pseudo-words weremore often categorized as similar to the language in whichtheir orthographic neighborhood was larger b = minus028SD = 007 z = 39 p lt 001 (Figure 2) No significanteffects were obtained for maxdiffBF and diffBF or for anyinteractions

Effects of continuous variables in the absence ofmarkers Response timesThe analysis of response times to neutral PWs (Table 6)contained the fixed factors response language (G vsE dummy coded) and the continuous variables diffBFmaxdiffBF and diffOLD Random factor structure of themodel included intercepts for subjects and items randomslopes for the interaction of response diffBF maxdiffBFand diffOLD within subjects and random slopes for

response within items Responses ldquoGermanrdquo were overallfaster than responses ldquoEnglishrdquo (see Table 5 for meanRTs) While there was no main effect for diffOLD itsinteraction with response was significant (Figure 3b)increases in diffOLD (ie increasingly larger Germancompared to English neighborhood) resulted in fasterreaction times (b =minus354) for ldquoGermanrdquo responses whileslowing ldquoEnglishrdquo responses (b = ndash354 + 2251 19)A similar marginally significant pattern was foundfor maxdiffBF (Figure 3c b response German minus12b response English 14) ldquoGermanrdquo responses were faster ifmaxdiffBF was positive (ie German-typical) whereasldquoEnglishrdquo responses were faster for negative maxdiffBFvalues (ie English-typical) The significant interactionof diffOLD and maxdiffBF (Figure 3a) resulted fromfaster responses for items with consistent diffOLDand maxdiffBF values and slower responses wheneither positive diffOLD was accompanied by negativemaxdiffBF values or vice versa In other words responseswere fast if both variables supported the choice of the samelanguage and slowed if they pointed to different languages(eg when a PW had more orthographic neighbors inEnglish than in German but a more German-typicalmaxdiffBF value) supporting the importance of bothvariables for the decision process All other main effectsand interactions were not significant

Interactions of markers and continuous variables3Language decisionAnalysis of errors for marked PWs contained thefixed factors marker language (G vs E) and thecontinuous variables diffBF and diffOLD as well asall of their interactions Random factor structure of

3 In both analyses marker language was dummy-coded with Englishas baseline such that main effects of continuous variables reflectedtheir effect for English-marked PWs and their interaction with markerlanguage the addition in beta for German-marked PWs as comparedto English-marked PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

586 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 6 Summary of linear mixed-effect regression for reaction times to neutral PW in Experiment 1 Orthographicneighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency difference diffBF = BFG ndash BFE maximalbigram frequency difference maxdiffBF is the maximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ 2 p-value Significance

Intercept 87774 3248 2702 7303 lt 001 lowastlowastdiffOLD minus354 600 minus059 035 56

diffBF minus1673 899 minus186 346 06 +

maxdiffBF minus1158 957 minus121 147 23

Response1 4749 893 532 2827 lt 001 lowastlowastdiffOLD diffBF 1259 931 135 183 18

diffOLD maxdiffBF minus1825 890 minus205 421 04 lowastdiffBF maxdiffBF minus485 1028 minus047 022 64

diffOLD Response 2251 882 255 652 01 lowastdiffBF Response 165 1213 014 002 89

maxdiffBF Response 2554 1391 184 337 07 +

diffOLD diffBF maxdiffBF 1189 1041 114 131 25

diffOLD diffBF Response minus1524 1309 minus116 135 24

diffOLD maxdiffBF Response 962 1300 074 055 46

diffBF maxdiffBF Response minus2260 1409 minus16 257 11

diffOLD diffBF maxdiffBF Response minus1478 1450 minus102 104 31

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

Experiment 1 Experiment 2

30

40

50

60

70

German English German EnglishdiffOLD

R

espo

nse

Eng

lish

maxdiffBF

German

English

Figure 2 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) and maximal bigramfrequency difference (maxdiffBF) on language decisions for neutral pseudo-words in Experiments 1 and 2

the model included intercepts for subjects and itemsand random slopes for the interaction of responseand diffOLD within subjects Overall above 80 ofresponses to marked pseudo-words were in line with theirmarker although more marker-incongruent responseswere made for German-marked than for English-markedPWs b = 06 SD = 022 z = 28 p = 006 (G 17E 10 incongruent responses see Table 5) Cruciallythe effect of marker language interacted with diffOLDb = minus06 SD = 02 z = ndash33 p lt001 (Figure 4a)Planned comparisons showed an increase in marker-

incongruent responses for German-marked PWs with adecrease in diffOLD (ie more English neighbors thanGerman neighbors) b = minus04 SD = 01 z = minus37 plt 001 whereas diffOLD had no effect on responses forEnglish-marked PWs (p gt 1)

Interactions of markers and continuous variablesResponse times

Analysis of RTs to marked PWs involved the fixedfactors marker language (G vs E) diffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 587

Table 7 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 1 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

(Intercept) 74556 2469 3020 91195 lt 001 lowastlowastdiffOLD 736 694 106 113 29

diffBF minus1226 842 minus146 212 15

Marker Language1 4767 1393 342 1170 lt 001 lowastlowastdiffOLD diffBF minus323 641 minus050 025 61

diffOLD Marker Language minus2539 965 minus263 693 008 lowastdiffBF Marker Language 379 1093 035 012 73

diffOLD diffBF Marker Language 1580 850 186 346 06 +

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Marker language was dummy-coded with English as baseline

800

850

900

950

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p = 04

800

850

900

950

German EnglishResponse

RT

[mse

c]b) diffOLD x Response p = 01

800

850

900

950

German EnglishResponse

RT

[mse

c]

c) maxdiffBF x Response p = 07

maxdiffBF German English

diffOLD German English

Figure 3 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) and language decisions (response) on RTs to neutral pseudo-words in Experiment 1 a)Interaction effect of diffOLD and maxdiffBF b) Interaction effect of diffOLD and response c) marginally significantinteraction effect of maxdiffBF and response

(Table 7) Random factor structure of the model includedintercepts for subjects and items and random slopesfor the interaction of marker language diffBF anddiffOLD Participants made 12 errors on average (English-marked 2ndash24 error trials German-marked 6ndash33 error

trials) which was not sufficient for an RT analysison error trials Thus RTs were analyzed for correctresponses only The results largely mirrored the patternof language decisions on marked PWs Responses toGerman PWs were slower than responses to English PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

588 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Figure 4 Visualization of interaction effects of differences in orthographic neighborhood size (diffOLD) and markerlanguage on the errors and RTs to marked pseudo-words in Experiments 1 and 2

(b = 477 see Table 7) Moreover there was a significant2-way interaction of marker language and diffOLD (b= minus254 Figure 4b) and a marginally significant 3-wayinteraction between diffOLD diffBF and marker language(b = 158) A separate model for English-marked PWsshowed no effects of the continuous variables for English-marked PWs (p gt 1) However the separate model forGerman-marked PWs showed that responses for German-marked PWs were slowed for larger English than Germanneighborhoods (ie decrease in diffOLD) b = minus189 SD= 65 t = minus280 χ2 (1) = 78 p = 005 Additionallythis effect was attenuated for more positive diffBF valuesb = 120 SD = 53 t = 227 χ2 (1) = 51 p = 02In other words differences in orthographic neighborhoodsize mattered less if the mean bigram frequency differencewas in line with marker language for German-markedPWs

Discussion

Our data replicate previous studies (Casaponsa et al2014 Thomas amp Allport 2000 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) that showed thatsublexical orthographic markers provide strong language

decision cues as orthographically marked pseudo-wordswere mostly categorized in line with their marker

Our data further extends previous findings in severalways First we show that continuous differences inlexical neighborhood size and maximal bigram frequencydifferences influence language decisions in unmarkedpseudo-words This was apparent in language attributionas well as in response latencies to neutral pseudo-words In particular response latencies increased whensublexical vs lexical levels provided conflicting languagemembership information suggesting that languagemembership information from both levels is integratedtowards a language decision Second we also find thatorthographic neighborhoods bias language decisions evenfor marked letter strings ndash as reflected by more marker-incongruent responses and slowed marker-congruentresponses for L1-marked PWs with larger orthographicneighborhoods in L2 than in L1 This effect was absentfor L2-marked PWs probably because our German-dominant participants could discard these pseudo-wordsas non-German based on their orthographic markers ndashrepresenting violations of L1 orthographic patterns ndashalone This also explains why correct categorizationsof English marked PWs were faster than respectivecorrect German categorizations (for a similar argument

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 3: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

580 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

models of non-selective lexical access) Bridging this gapis important for any comprehensive model of bilingualword recognition that describes language membershiprepresentations and accounts for their involvement invisual word recognition such as the BIA+ model (Dijkstraamp van Heuven 2002 van Kesteren et al 2012)

The present study

We designed the present study to shed light onthe contribution of continuous sublexical statisticsmeasured through bigram frequencies and lexicalstatistics measured through orthographic neighborhoodsizes to language membership attribution Moreover weinvestigated whether continuous language membershipinformation is ignored if a decision can be made based onan orthographic marker as suggested by Vaid and Frenck-Mestre (2002) Finally we asked whether the contributionof sublexical and lexical variables to language decisionsdepends on the output modality ndash comparing performanceacross decision and naming tasks

First we performed a corpus analysis to investigate theextent to which English and German words differ in theirbigram frequencies and orthographic neighborhood sizesin English and German We reasoned that only variablesthat are differently distributed in the two languages arepotentially relevant for language membership decisionsFor example for bigram frequency to be informative aboutlanguage membership bigrams must exist that are morefrequent in German than in English and vice versa Atthe level of orthographic neighborhoods only if Germanwords have more orthographic neighbors in Germanthan in English and vice versa can the orthographicneighborhood of a word be diagnostic of its languagemembership

In the second step (Experiment 1) we designedan experiment that probed the effect of differences inorthographic neighborhood sizes and bigram frequencymeasures between languages on language decisionbehavior of GermanndashEnglish bilinguals To isolate theeffects of differences in language-similarity statisticswe employed pseudo-words (PWs) This allowed us toeliminate the influence of additional factors such asword frequency and semantics and rendered the exactmanipulation of frequency variables more feasible Webuilt on results from the corpus analysis to identifythe relevant range for each variable Additionally wecontrasted the effects of continuously varying variableswith the effect of orthographic markers (such as pf asin Pfanne (pan) for German or gh as in laugh forEnglish) We expected to find no effects of continuousdifferences in sublexical and lexical similarity to thetwo languages if bilinguals are sensitive to orthographicmarkers only However if bilinguals are sensitive tofine-grained statistical differences between languages

continuous variation of sublexical and lexical statisticsshould affect bilingualsrsquo language membership decisions

In the third step (Experiment 2) we investigatedthe extent to which the effects of probabilistic cuesdepend on the output modality by contrasting the2-alternative language decision task of Experiment1 with a naming task The language decisiontask requires an explicit decision between the twolanguages based on sublexical and lexical cuesContrary to this naming requires a mapping fromorthographic sublexical representations to language-specific phonology which for pseudo-words does notnecessarily require involvement of lexical representations(Coltheart Rastle Perry Langdon amp Ziegler 2001)Thus we expected sublexical representations to play alarger role in the naming task than in the languagedecision task and orthographic neighborhoods to playa larger role in the language decision task than in thenaming task A comparison of naming and languagedecision tasks will thus show whether language attributionin naming would preferentially be resolved based onsublexical orthographic and phonological informationwith decreasing involvement of lexical representations

Corpus analysis

Methods

The corpus analysis was based on all spelling-correctedwords of the German and the full English Subtlexdatabases (Brysbaert Buchmeier Conrad Jacobs Boelteamp Boehl 2011) We analyzed Levenshtein distance(OLD20 Yarkoni Balota amp Yap 2008) as a measure oforthographic neighborhood size and mean and maximalbigram frequencies as measures of sublexical frequencyThese variables were calculated for each word of bothcorpuses separately for each language This made itpossible to examine the extent to which the statisticalproperties of a given variable differ between languagesand whether words of one language are probable in theother language with respect to each variable Since thefocus of this research was on characterizing propertiesof letter strings that are unique for a certain languageidentical interlingual homographs were excluded from thisanalysis

Orthographic neighborhood sizeThe Levenshtein orthographic neighborhood distance ofa letter string (Yarkoni et al 2008) is its average distanceto its 20 closest Levenshtein orthographic neighborsin a lexicon The Levenshtein distance between twoletter strings is computed as the minimal number ofletter deletions insertions and changes that is neededto transform their orthographic word forms into eachother We computed OLD20 using the ldquovwrrdquo library in

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 581

the statistical package R (Keuleers 2013) OLD20 isa variable with a strong dependency on word lengthsuch that if a specific word length is more common ina language the orthographic neighborhoods of wordsof that length are denser than for other word lengthsTo control for the difference in average word lengthbetween German and English and make OLD20 valuesin German and English comparable OLD20 was mean-normalized for each word length within each languageAll language comparisons were based on normalizedscores For each word in both corpora we computed thedifference between its average Levenshtein distances tothe 20 closest neighbors in the German Subtlex (OLDG)and in the English Subtlex (OLDE) and the differencebetween these variables (diffOLD) DiffOLD was codedsuch that positive values reflect a larger orthographicneighborhood in German than in English

Bigram frequenciesPositional log10 token bigram frequencies were computedbased on word form frequencies from the Subtlexdatabases and normalized to frequency per millionbigrams This allows better comparability acrosslanguages than normalization per million words asGerman words are longer on average Bigram frequencieswere also mean-normalized within each language Foreach word in the two corpora we computed the differencebetween the mean German bigram frequency (mBGG) andthe mean English bigram frequency (mBGE) diffBF ofits constituent bigrams

Furthermore we hypothesized that language decisionsmight be guided not only by the average difference inbigram frequency between languages but also by singlesublexical units with a strong difference in languagesimilarity For each word we thus also identified itsconstituent bigram with the largest difference betweenits frequency in German and English We denote thismaximal bigram frequency difference as maxdiffBF withpositive values for high German-typicality and negativevalues for higher English-typicality

Results and discussion

The descriptive statistics for the three difference variablesare presented in Table 1 and their distributions are plottedin Figure 1

Of the three difference variables diffOLD shows thelargest difference between its distributions in English andGerman (Figure 1 upper panel) with more than 90of words of each language having more orthographicneighbors in their language than in the respectively otherlanguage (drsquo = 27) While the distributions of maxdiffBFare also similarly different between languages (drsquo = 14)there is a large overlap between the distributions of diffBFin German and English (drsquo = 024)

Figure 1 Distribution of differences in orthographicneighborhood size (diffOLD upper panel) mean bigramfrequency differences (diffBF middle panel) and maximalbigram frequency differences (maxdiffBF bottom panel) inGerman and English SUBTLEX lexica

The corpus analysis thus reveals that lexicalneighborhoods can be very informative for the languagemembership of a word in line with previous findingsshowing larger same-language than cross-languageneighborhoods (Marian Bartolotti Chabal amp Shook2012) It also provides maxdiffBF as a potential source oflanguage membership information whereas mean bigramfrequency differences were more similar across languagesIn the following two experiments we investigate the effectsof all three variables on language attribution

Experiment 1 Language Decision Task

Methods amp materials

ParticipantsThis study was conducted with highly proficient GermanndashEnglish bilinguals (n = 25 5 male ages 21ndash35 meanage 27 years) All were students at the Freie Universitaetin Berlin and had studied English as their first foreignlanguage in high school All had spent at least 9consecutive months living in an English-speaking foreigncountry (GB USA English-speaking Canada Australiaor New Zealand) Participants were right-handed hadnormal or corrected-to-normal vision and did not sufferfrom a reading disability or other learning disordersAll participants completed an online language historyquestionnaire (adapted from Li Sepanski amp Zhao 2006)prior to participation and were only admitted to the

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

582 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 1 Descriptive statistics of the difference variables their pair-wise correlation in the English andGerman SUBTLEX lexica and Cohenrsquos d for the comparison between English and German distributions ofeach variable Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequencydifference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

SUBTLEX Lexicon

English German

Cohenrsquos d mean min max SD mean min max SD

diffOLD 27 minus17 minus54 33 10 21 minus32 75 13

diffBF 02 minus09 minus59 34 09 02 minus55 44 11

maxdiffBF 14 minus10 minus31 28 10 10 minus31 28 10

Correlations

diffBF maxdiffBF diffBF maxdiffBF

diffOLD 022 027 034 038

diffBF 03 028

Table 2 Profiles of participants in Experiment 1 and 2 There were no significant differences between participants ofExperiment 1 and 2

Experiment 1 Experiment 2

L1 (German) L2(English) L1 (German) L2(English)

mean SD mean SD mean SD mean SD

Lextale 908 69 805lowast 87 904 50 770lowast 130

reading rate wmin real words 1314 102 124 lowast 90 1300 140 1232 lowast 95

pseudo-words 835 175 813 135 835 200 786 106

Self report Age of acquisition (years) ndash ndash 97 164 ndash ndash 93 20

Proficiency1 ndash ndash 60 05 ndash ndash 59 04

Accent2 ndash ndash 24 10 ndash ndash 27 08

Notes1 On a scale of 1 (basic) ndash 7 (native)2 On a scale of 1(no accent) ndash 7 (very strong accent)lowast p lt 05 for comparison between L1 and L2

study if they fulfilled the above criteria Self-reportsof L2 proficiency were made on a 1-7 Likert scaleseparately for reading writing speaking and listeningabilities The averaged self-estimated English proficiencywas 59 (range 45ndash7) Additionally participantsrsquo generalproficiency and reading abilities were assessed after theexperiment using the LEXTALE tests of German andEnglish proficiency (Lemhoumlfer amp Broersma 2011) Thetests consist of short lexical decision tasks which includewords of varying frequency and pseudo-words The finalscore is the average percentage of correct responses towords and pseudo-words Reading abilities were assessedusing the reading and phonological decoding subtests ofthe TOWRE (Torgesen amp Rashotte 1999) for English andthe word and pseudo-word reading subtests of the SLRT-IItest (Moll amp Landerl 2010) for German Both tests assess

reading rate (wordsmin) in single-item reading of wordsand pseudo-words (PW) Participantsrsquo language profileis summarized in Table 2 Participants were recruitedthrough advertisements on campus and in mailing lists forexperiment participation All participants completed aninformed consent form prior to beginning the experimentThey were reimbursed either monetary or with coursecredit

Stimuli amp DesignThe stimulus set contained 192 marked pseudo-wordshalf of which were orthographically legal in Germanonly while the other half was orthographically legalin English only and 192 neutral pseudo-words whichwere orthographically legal in both languages Within theset of marked pseudo-words diffOLD and diffBF were

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 583

Table 3 Properties of neutral pseudo-words stimuli ofExperiments 1 and 2 Neutral pseudo-words containedonly bigrams that were legal in both languagesOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

distribution correlations

mean min max diffBF maxdiffBF

diffOLD minus016 minus260 240 024 038

diffBF 010 minus270 230 060

maxdiffBF 005 minus100 110

varied parametrically Within the set of neutral pseudo-words diffOLD diffBF and maxdiffBF were variedparametrically Pseudo-words were selected from theEnglish lexicon project (Balota Yap Cortese HutchisonKessler Loftis Neely Nelson Simpson amp Treiman2007) a German pseudo-word database provided byauthors MC and AA as well as a pseudo-word setcreated specifically for this study Note that all markedpseudo-words were composed of letters that exist in bothlanguages (ie excluding the German letters ldquoauml ouml uumlszligrdquo) such that decisions based on low-level visual pop-outeffects would not be possible

Neutral pseudo-wordsNeutral PWs were orthographically legal and pronounce-able with the same number of phonemes and syllableswhen pronounced in either language Selection of neutralpseudo-words was guided by three considerations Firstwe ensured that for all three variables ndash diffBF maxdiffBFand diffOLD ndash the range of overlap between their Englishand German distributions was covered (Figure 1 for thedistributions in the corpus and Table 3 for properties of thestimulus set) Second for each of the variables half of thestimuli had positive values and half had negative valuessuch that an approximately equal number of neutral PWwas German-like or English-like respectively in each ofthe three variables Third the pseudo-words were chosensuch as to reduce the pair-wise correlations between thethree variables as well as their correlations with pseudo-word length (see Table 3) Note that the correlationbetween diffBF and maxdiffBF in neutral pseudo-wordsis high as the two variables rely on the same informationsource

Marked pseudo-wordsMarked PWs contained a letter combination (1ndash3 letters)that violated orthographic rules of one of the languages

Table 4 Properties of marked pseudo-words MarkedPWs contained at least one bigram that was legal in onelanguage (ie the marker language) and had a frequencyof 0 in monomorphemic words of the other languageOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

English-marked PW German-marked PW

mean min max mean min max

diffOLD minus090 minus400 100 070 minus200 270

diffBF minus070 minus360 160 050 minus240 310

maxdiffBF minus180 minus270 minus120 160 110 270

(such as th ght or word-final y for English or pf hlor word-middle sch for German markers) The bigramfrequency of marker bigrams is not necessarily 0 in therespective other language if computed across the wholeSUBTLEX because these letter combinations could occurin rare cases at the boundaries between (otherwise free)morphemes or in loan words and proper names Forexample the German marker pf occurs in English wordssuch as cuPFul or camPFire but never as a graphemewithin a morpheme or syllable Similarly the Englishmarker th occurs in German names (eg FuerTH)Greek loan words (eg THeologie (theology)) and asa bigram unifying two normally free morphemes (egachTHundert (eight hundred)) but not as a grapheme inother etymologically German words Since all our PWswere mono-morphemic we recomputed the frequenciesfor all orthographic markers based on mono-morphemicwords (that is consisting of only one root morphemeplus a simple ending) of the relevant word lengthsIn the resulting set of words marker frequencies were0 in the respectively other language As orthographicmarkers coincide with bigrams that define maxdiffBFvalues marked pseudo-words were chosen to differparametrically in their diffBF and diffOLD values onlywhereas English-marked and German-marked pseudo-words were balanced with respect to maxdiffBF values(correlation between diffOLD and diffBF rE = minus19rG = 42 see Table 4 for descriptive statistics) Lengthand syllable number were matched across marker typeconditions and did not correlate with experimentalvariables

To further ensure that marked pseudo-words wereindeed pronounceable and orthographically legal in onelanguage only and neutral pseudo-words in both allpseudo-words were rated for their pronounceability andorthographic legality in German and English by 3 native

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

584 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

speakers of German and English respectively Onlypseudo-words that were rated as pronounceable andorthographically legal in both languages by all refereeswere included in the study as neutral pseudo-wordsSimilarly marked pseudo-words were only included ifall referees rated them as pronounceable in the markinglanguage and illegal in the other language

ProcedureParticipants performed a speeded language decision taskin which they were required to categorize whether apseudo-word was more likely to be a German or anEnglish word Stimuli were presented on a 18primeprime computerscreen (Arial font size 40) using the Psychtoolbox(Brainard 1997 Kleiner Brainard Pelli Ingling Murrayamp Broussard 2007) for MATLAB (version 7100Mathworks Inc Natick Massachusetts) A trial consistedof a fixation cross presented for 800 ms followedby presentation of the stimulus until the participantresponded for at most 2 sec Responses were given bybutton press on a custom-made response box with the twoindex fingers Buttons were assigned to a language byGerman and British flags in the respective corners of thescreen below the stimulus Button assignment remainedconstant within but switched between blocks On eachtrial response latency (RT) and language decision wererecorded

The task began with instructions and an example trialfollowed by the 384 experimental trials presented in 8blocks of equal length1 Pseudo-randomization ensuredthat not more than three equally marked items wouldappear consecutively Participants could have self-pacedbreaks after each block If a participant failed to respondbefore the time-out for more than three times he or shewas reminded to respond faster Subsequent to completionof the language decision task participants completed theproficiency tests described above

Statistical analysisAll statistical analyses were performed in the opensource statistical programming environment R (R coreteam 2013) RTs were analysed with linear mixed-effects models and language decisions were analysedusing logistic mixed-effects models (Baayen Davidsonamp Bates 2008 Jaeger 2008) as implemented in thelme4 package for R (Bates Maechler Bolker amp Walker2014) and included subjects and items as random factorsDue to the complex random factor structure of our datathe implementation of maximal random factor structurewas not practicable Instead we modelled the randomfactor structure with the intercept and the maximal order

1 The first 2 blocks contained neutral pseudo-words only Block typedid not affect the variables of interest and hence results are reportedcollapsed across block types

interaction of fixed factors as recently suggested (Barr2013) Significance of fixed effects was assessed throughtype-III Wald-tests of single parameter estimates Wefollow the convention of the lme4 package by reportingthe z-test statistic for logistic mixed-effects models andthe χ2-test statistic for linear models (note though thatboth test statistics are equivalent)

Note that it is not possible to define correctand wrong responses for neutral pseudo-words sincemost of them contain conflicting language membershipinformation and no exclusive language cues Thuslanguage decision analyses for neutral pseudo-wordsmodelled the probability of English responses Howeverfor marked pseudo-words the marking creates a strongpreference for the marker language Thus for markedpseudo-words we conducted analyses on the marker-incongruent responses to which we refer as errorresponses

First to examine whether orthographic markers biasedlanguage decisions we analysed the effects of marker type(German (G) vs English (E) vs neutral (N))2 Then theeffects of continuous variables on language decision andresponse latencies were analysed separately for neutraland marked pseudo-words

Outlier exclusion was performed for each participantand separately for marked and neutral PWs based onseparate multiple regression models for each participantcontaining the same factors that were used for RTanalyses The expected RT for each trial was determinedand trials with a difference of more than 25 residualsrsquo SDbetween expected and actual RT were labelled as outliersand discarded from further analyses

Results

Data of all participants were included in the analysesOutlier analysis led to a removal of 25 of the trialsFor illustration purposes only continuous variables weresubdivided in three equally large subsets of which thetwo extreme ones are depicted in figures with the labelldquoGermanrdquo for the subset with most positive ie mostGerman-like values and the label ldquoEnglishrdquo for the mostnegative ie most English-like values

Effect of marker presence on language decisionsWe analyzed the effect of marker language (G vs E vs N)on the probability of English responses with a logisticmixed-effects model which included marker languageas a fixed factor intercepts for subject and items asrandom factors and random slopes for marker language

2 Due to the high number of marker-congruent responses for markedPWs there were not enough error trials (lt 15 for most participants)to analyze RTs across marked and neutral PWs including response asfactor Thus we analyze RTs separately for marked and neutral PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 585

Table 5 Mean and standard deviations (SD) for RTs to marked and neutral PW in Experiment 1 and 2

RT

Response German Response English Response English

Exp 1 mean SD mean SD mean SD

Markedness English 853 315 746 204 90 29

German 776 220 859 316 16 36

Neutral 871 279 905 292 45 50

Exp 2

mean SD mean SD mean SD

Markedness English 758 244 778 311 79 41

German 730 215 792 307 17 38

Neutral 712 211 768 318 42 49

within subject Marker language was contrast-coded inthe model with the two orthogonal contrasts English-marked and neutral versus German-marked and English-marked vs neutral PWs Marker language influenced theattribution of pseudo-words towards that language (E90 G 17 N 45 response English see Table 5)χ2 (1) = 4629 p lt 001 Planned comparisons whichwere directly encoded in the model showed that English-marked and neutral PWs were more often classified asEnglish than German-marked PWs b = 208 SD = 011z = 187 p lt 001 and that the same held for English-marked as compared to neutral pseudo-words b = 14SD = 008 z = 180 p lt 001

Effects of continuous variables in the absence ofmarkers Language decisionThe analysis of language decisions on neutral PWsinvolved the continuous variables diffBF maxdiffBF anddiffOLD as well as all possible interactions between themas fixed effects Random factor structure of the modelincluded intercepts for subjects and items as well asrandom slopes for the interaction of diffBF maxdiffBFand diffOLD As expected neutral pseudo-words weremore often categorized as similar to the language in whichtheir orthographic neighborhood was larger b = minus028SD = 007 z = 39 p lt 001 (Figure 2) No significanteffects were obtained for maxdiffBF and diffBF or for anyinteractions

Effects of continuous variables in the absence ofmarkers Response timesThe analysis of response times to neutral PWs (Table 6)contained the fixed factors response language (G vsE dummy coded) and the continuous variables diffBFmaxdiffBF and diffOLD Random factor structure of themodel included intercepts for subjects and items randomslopes for the interaction of response diffBF maxdiffBFand diffOLD within subjects and random slopes for

response within items Responses ldquoGermanrdquo were overallfaster than responses ldquoEnglishrdquo (see Table 5 for meanRTs) While there was no main effect for diffOLD itsinteraction with response was significant (Figure 3b)increases in diffOLD (ie increasingly larger Germancompared to English neighborhood) resulted in fasterreaction times (b =minus354) for ldquoGermanrdquo responses whileslowing ldquoEnglishrdquo responses (b = ndash354 + 2251 19)A similar marginally significant pattern was foundfor maxdiffBF (Figure 3c b response German minus12b response English 14) ldquoGermanrdquo responses were faster ifmaxdiffBF was positive (ie German-typical) whereasldquoEnglishrdquo responses were faster for negative maxdiffBFvalues (ie English-typical) The significant interactionof diffOLD and maxdiffBF (Figure 3a) resulted fromfaster responses for items with consistent diffOLDand maxdiffBF values and slower responses wheneither positive diffOLD was accompanied by negativemaxdiffBF values or vice versa In other words responseswere fast if both variables supported the choice of the samelanguage and slowed if they pointed to different languages(eg when a PW had more orthographic neighbors inEnglish than in German but a more German-typicalmaxdiffBF value) supporting the importance of bothvariables for the decision process All other main effectsand interactions were not significant

Interactions of markers and continuous variables3Language decisionAnalysis of errors for marked PWs contained thefixed factors marker language (G vs E) and thecontinuous variables diffBF and diffOLD as well asall of their interactions Random factor structure of

3 In both analyses marker language was dummy-coded with Englishas baseline such that main effects of continuous variables reflectedtheir effect for English-marked PWs and their interaction with markerlanguage the addition in beta for German-marked PWs as comparedto English-marked PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

586 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 6 Summary of linear mixed-effect regression for reaction times to neutral PW in Experiment 1 Orthographicneighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency difference diffBF = BFG ndash BFE maximalbigram frequency difference maxdiffBF is the maximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ 2 p-value Significance

Intercept 87774 3248 2702 7303 lt 001 lowastlowastdiffOLD minus354 600 minus059 035 56

diffBF minus1673 899 minus186 346 06 +

maxdiffBF minus1158 957 minus121 147 23

Response1 4749 893 532 2827 lt 001 lowastlowastdiffOLD diffBF 1259 931 135 183 18

diffOLD maxdiffBF minus1825 890 minus205 421 04 lowastdiffBF maxdiffBF minus485 1028 minus047 022 64

diffOLD Response 2251 882 255 652 01 lowastdiffBF Response 165 1213 014 002 89

maxdiffBF Response 2554 1391 184 337 07 +

diffOLD diffBF maxdiffBF 1189 1041 114 131 25

diffOLD diffBF Response minus1524 1309 minus116 135 24

diffOLD maxdiffBF Response 962 1300 074 055 46

diffBF maxdiffBF Response minus2260 1409 minus16 257 11

diffOLD diffBF maxdiffBF Response minus1478 1450 minus102 104 31

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

Experiment 1 Experiment 2

30

40

50

60

70

German English German EnglishdiffOLD

R

espo

nse

Eng

lish

maxdiffBF

German

English

Figure 2 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) and maximal bigramfrequency difference (maxdiffBF) on language decisions for neutral pseudo-words in Experiments 1 and 2

the model included intercepts for subjects and itemsand random slopes for the interaction of responseand diffOLD within subjects Overall above 80 ofresponses to marked pseudo-words were in line with theirmarker although more marker-incongruent responseswere made for German-marked than for English-markedPWs b = 06 SD = 022 z = 28 p = 006 (G 17E 10 incongruent responses see Table 5) Cruciallythe effect of marker language interacted with diffOLDb = minus06 SD = 02 z = ndash33 p lt001 (Figure 4a)Planned comparisons showed an increase in marker-

incongruent responses for German-marked PWs with adecrease in diffOLD (ie more English neighbors thanGerman neighbors) b = minus04 SD = 01 z = minus37 plt 001 whereas diffOLD had no effect on responses forEnglish-marked PWs (p gt 1)

Interactions of markers and continuous variablesResponse times

Analysis of RTs to marked PWs involved the fixedfactors marker language (G vs E) diffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 587

Table 7 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 1 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

(Intercept) 74556 2469 3020 91195 lt 001 lowastlowastdiffOLD 736 694 106 113 29

diffBF minus1226 842 minus146 212 15

Marker Language1 4767 1393 342 1170 lt 001 lowastlowastdiffOLD diffBF minus323 641 minus050 025 61

diffOLD Marker Language minus2539 965 minus263 693 008 lowastdiffBF Marker Language 379 1093 035 012 73

diffOLD diffBF Marker Language 1580 850 186 346 06 +

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Marker language was dummy-coded with English as baseline

800

850

900

950

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p = 04

800

850

900

950

German EnglishResponse

RT

[mse

c]b) diffOLD x Response p = 01

800

850

900

950

German EnglishResponse

RT

[mse

c]

c) maxdiffBF x Response p = 07

maxdiffBF German English

diffOLD German English

Figure 3 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) and language decisions (response) on RTs to neutral pseudo-words in Experiment 1 a)Interaction effect of diffOLD and maxdiffBF b) Interaction effect of diffOLD and response c) marginally significantinteraction effect of maxdiffBF and response

(Table 7) Random factor structure of the model includedintercepts for subjects and items and random slopesfor the interaction of marker language diffBF anddiffOLD Participants made 12 errors on average (English-marked 2ndash24 error trials German-marked 6ndash33 error

trials) which was not sufficient for an RT analysison error trials Thus RTs were analyzed for correctresponses only The results largely mirrored the patternof language decisions on marked PWs Responses toGerman PWs were slower than responses to English PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

588 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Figure 4 Visualization of interaction effects of differences in orthographic neighborhood size (diffOLD) and markerlanguage on the errors and RTs to marked pseudo-words in Experiments 1 and 2

(b = 477 see Table 7) Moreover there was a significant2-way interaction of marker language and diffOLD (b= minus254 Figure 4b) and a marginally significant 3-wayinteraction between diffOLD diffBF and marker language(b = 158) A separate model for English-marked PWsshowed no effects of the continuous variables for English-marked PWs (p gt 1) However the separate model forGerman-marked PWs showed that responses for German-marked PWs were slowed for larger English than Germanneighborhoods (ie decrease in diffOLD) b = minus189 SD= 65 t = minus280 χ2 (1) = 78 p = 005 Additionallythis effect was attenuated for more positive diffBF valuesb = 120 SD = 53 t = 227 χ2 (1) = 51 p = 02In other words differences in orthographic neighborhoodsize mattered less if the mean bigram frequency differencewas in line with marker language for German-markedPWs

Discussion

Our data replicate previous studies (Casaponsa et al2014 Thomas amp Allport 2000 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) that showed thatsublexical orthographic markers provide strong language

decision cues as orthographically marked pseudo-wordswere mostly categorized in line with their marker

Our data further extends previous findings in severalways First we show that continuous differences inlexical neighborhood size and maximal bigram frequencydifferences influence language decisions in unmarkedpseudo-words This was apparent in language attributionas well as in response latencies to neutral pseudo-words In particular response latencies increased whensublexical vs lexical levels provided conflicting languagemembership information suggesting that languagemembership information from both levels is integratedtowards a language decision Second we also find thatorthographic neighborhoods bias language decisions evenfor marked letter strings ndash as reflected by more marker-incongruent responses and slowed marker-congruentresponses for L1-marked PWs with larger orthographicneighborhoods in L2 than in L1 This effect was absentfor L2-marked PWs probably because our German-dominant participants could discard these pseudo-wordsas non-German based on their orthographic markers ndashrepresenting violations of L1 orthographic patterns ndashalone This also explains why correct categorizationsof English marked PWs were faster than respectivecorrect German categorizations (for a similar argument

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 4: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

Sublexical and lexical effects on language membership decision 581

the statistical package R (Keuleers 2013) OLD20 isa variable with a strong dependency on word lengthsuch that if a specific word length is more common ina language the orthographic neighborhoods of wordsof that length are denser than for other word lengthsTo control for the difference in average word lengthbetween German and English and make OLD20 valuesin German and English comparable OLD20 was mean-normalized for each word length within each languageAll language comparisons were based on normalizedscores For each word in both corpora we computed thedifference between its average Levenshtein distances tothe 20 closest neighbors in the German Subtlex (OLDG)and in the English Subtlex (OLDE) and the differencebetween these variables (diffOLD) DiffOLD was codedsuch that positive values reflect a larger orthographicneighborhood in German than in English

Bigram frequenciesPositional log10 token bigram frequencies were computedbased on word form frequencies from the Subtlexdatabases and normalized to frequency per millionbigrams This allows better comparability acrosslanguages than normalization per million words asGerman words are longer on average Bigram frequencieswere also mean-normalized within each language Foreach word in the two corpora we computed the differencebetween the mean German bigram frequency (mBGG) andthe mean English bigram frequency (mBGE) diffBF ofits constituent bigrams

Furthermore we hypothesized that language decisionsmight be guided not only by the average difference inbigram frequency between languages but also by singlesublexical units with a strong difference in languagesimilarity For each word we thus also identified itsconstituent bigram with the largest difference betweenits frequency in German and English We denote thismaximal bigram frequency difference as maxdiffBF withpositive values for high German-typicality and negativevalues for higher English-typicality

Results and discussion

The descriptive statistics for the three difference variablesare presented in Table 1 and their distributions are plottedin Figure 1

Of the three difference variables diffOLD shows thelargest difference between its distributions in English andGerman (Figure 1 upper panel) with more than 90of words of each language having more orthographicneighbors in their language than in the respectively otherlanguage (drsquo = 27) While the distributions of maxdiffBFare also similarly different between languages (drsquo = 14)there is a large overlap between the distributions of diffBFin German and English (drsquo = 024)

Figure 1 Distribution of differences in orthographicneighborhood size (diffOLD upper panel) mean bigramfrequency differences (diffBF middle panel) and maximalbigram frequency differences (maxdiffBF bottom panel) inGerman and English SUBTLEX lexica

The corpus analysis thus reveals that lexicalneighborhoods can be very informative for the languagemembership of a word in line with previous findingsshowing larger same-language than cross-languageneighborhoods (Marian Bartolotti Chabal amp Shook2012) It also provides maxdiffBF as a potential source oflanguage membership information whereas mean bigramfrequency differences were more similar across languagesIn the following two experiments we investigate the effectsof all three variables on language attribution

Experiment 1 Language Decision Task

Methods amp materials

ParticipantsThis study was conducted with highly proficient GermanndashEnglish bilinguals (n = 25 5 male ages 21ndash35 meanage 27 years) All were students at the Freie Universitaetin Berlin and had studied English as their first foreignlanguage in high school All had spent at least 9consecutive months living in an English-speaking foreigncountry (GB USA English-speaking Canada Australiaor New Zealand) Participants were right-handed hadnormal or corrected-to-normal vision and did not sufferfrom a reading disability or other learning disordersAll participants completed an online language historyquestionnaire (adapted from Li Sepanski amp Zhao 2006)prior to participation and were only admitted to the

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

582 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 1 Descriptive statistics of the difference variables their pair-wise correlation in the English andGerman SUBTLEX lexica and Cohenrsquos d for the comparison between English and German distributions ofeach variable Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequencydifference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

SUBTLEX Lexicon

English German

Cohenrsquos d mean min max SD mean min max SD

diffOLD 27 minus17 minus54 33 10 21 minus32 75 13

diffBF 02 minus09 minus59 34 09 02 minus55 44 11

maxdiffBF 14 minus10 minus31 28 10 10 minus31 28 10

Correlations

diffBF maxdiffBF diffBF maxdiffBF

diffOLD 022 027 034 038

diffBF 03 028

Table 2 Profiles of participants in Experiment 1 and 2 There were no significant differences between participants ofExperiment 1 and 2

Experiment 1 Experiment 2

L1 (German) L2(English) L1 (German) L2(English)

mean SD mean SD mean SD mean SD

Lextale 908 69 805lowast 87 904 50 770lowast 130

reading rate wmin real words 1314 102 124 lowast 90 1300 140 1232 lowast 95

pseudo-words 835 175 813 135 835 200 786 106

Self report Age of acquisition (years) ndash ndash 97 164 ndash ndash 93 20

Proficiency1 ndash ndash 60 05 ndash ndash 59 04

Accent2 ndash ndash 24 10 ndash ndash 27 08

Notes1 On a scale of 1 (basic) ndash 7 (native)2 On a scale of 1(no accent) ndash 7 (very strong accent)lowast p lt 05 for comparison between L1 and L2

study if they fulfilled the above criteria Self-reportsof L2 proficiency were made on a 1-7 Likert scaleseparately for reading writing speaking and listeningabilities The averaged self-estimated English proficiencywas 59 (range 45ndash7) Additionally participantsrsquo generalproficiency and reading abilities were assessed after theexperiment using the LEXTALE tests of German andEnglish proficiency (Lemhoumlfer amp Broersma 2011) Thetests consist of short lexical decision tasks which includewords of varying frequency and pseudo-words The finalscore is the average percentage of correct responses towords and pseudo-words Reading abilities were assessedusing the reading and phonological decoding subtests ofthe TOWRE (Torgesen amp Rashotte 1999) for English andthe word and pseudo-word reading subtests of the SLRT-IItest (Moll amp Landerl 2010) for German Both tests assess

reading rate (wordsmin) in single-item reading of wordsand pseudo-words (PW) Participantsrsquo language profileis summarized in Table 2 Participants were recruitedthrough advertisements on campus and in mailing lists forexperiment participation All participants completed aninformed consent form prior to beginning the experimentThey were reimbursed either monetary or with coursecredit

Stimuli amp DesignThe stimulus set contained 192 marked pseudo-wordshalf of which were orthographically legal in Germanonly while the other half was orthographically legalin English only and 192 neutral pseudo-words whichwere orthographically legal in both languages Within theset of marked pseudo-words diffOLD and diffBF were

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 583

Table 3 Properties of neutral pseudo-words stimuli ofExperiments 1 and 2 Neutral pseudo-words containedonly bigrams that were legal in both languagesOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

distribution correlations

mean min max diffBF maxdiffBF

diffOLD minus016 minus260 240 024 038

diffBF 010 minus270 230 060

maxdiffBF 005 minus100 110

varied parametrically Within the set of neutral pseudo-words diffOLD diffBF and maxdiffBF were variedparametrically Pseudo-words were selected from theEnglish lexicon project (Balota Yap Cortese HutchisonKessler Loftis Neely Nelson Simpson amp Treiman2007) a German pseudo-word database provided byauthors MC and AA as well as a pseudo-word setcreated specifically for this study Note that all markedpseudo-words were composed of letters that exist in bothlanguages (ie excluding the German letters ldquoauml ouml uumlszligrdquo) such that decisions based on low-level visual pop-outeffects would not be possible

Neutral pseudo-wordsNeutral PWs were orthographically legal and pronounce-able with the same number of phonemes and syllableswhen pronounced in either language Selection of neutralpseudo-words was guided by three considerations Firstwe ensured that for all three variables ndash diffBF maxdiffBFand diffOLD ndash the range of overlap between their Englishand German distributions was covered (Figure 1 for thedistributions in the corpus and Table 3 for properties of thestimulus set) Second for each of the variables half of thestimuli had positive values and half had negative valuessuch that an approximately equal number of neutral PWwas German-like or English-like respectively in each ofthe three variables Third the pseudo-words were chosensuch as to reduce the pair-wise correlations between thethree variables as well as their correlations with pseudo-word length (see Table 3) Note that the correlationbetween diffBF and maxdiffBF in neutral pseudo-wordsis high as the two variables rely on the same informationsource

Marked pseudo-wordsMarked PWs contained a letter combination (1ndash3 letters)that violated orthographic rules of one of the languages

Table 4 Properties of marked pseudo-words MarkedPWs contained at least one bigram that was legal in onelanguage (ie the marker language) and had a frequencyof 0 in monomorphemic words of the other languageOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

English-marked PW German-marked PW

mean min max mean min max

diffOLD minus090 minus400 100 070 minus200 270

diffBF minus070 minus360 160 050 minus240 310

maxdiffBF minus180 minus270 minus120 160 110 270

(such as th ght or word-final y for English or pf hlor word-middle sch for German markers) The bigramfrequency of marker bigrams is not necessarily 0 in therespective other language if computed across the wholeSUBTLEX because these letter combinations could occurin rare cases at the boundaries between (otherwise free)morphemes or in loan words and proper names Forexample the German marker pf occurs in English wordssuch as cuPFul or camPFire but never as a graphemewithin a morpheme or syllable Similarly the Englishmarker th occurs in German names (eg FuerTH)Greek loan words (eg THeologie (theology)) and asa bigram unifying two normally free morphemes (egachTHundert (eight hundred)) but not as a grapheme inother etymologically German words Since all our PWswere mono-morphemic we recomputed the frequenciesfor all orthographic markers based on mono-morphemicwords (that is consisting of only one root morphemeplus a simple ending) of the relevant word lengthsIn the resulting set of words marker frequencies were0 in the respectively other language As orthographicmarkers coincide with bigrams that define maxdiffBFvalues marked pseudo-words were chosen to differparametrically in their diffBF and diffOLD values onlywhereas English-marked and German-marked pseudo-words were balanced with respect to maxdiffBF values(correlation between diffOLD and diffBF rE = minus19rG = 42 see Table 4 for descriptive statistics) Lengthand syllable number were matched across marker typeconditions and did not correlate with experimentalvariables

To further ensure that marked pseudo-words wereindeed pronounceable and orthographically legal in onelanguage only and neutral pseudo-words in both allpseudo-words were rated for their pronounceability andorthographic legality in German and English by 3 native

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

584 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

speakers of German and English respectively Onlypseudo-words that were rated as pronounceable andorthographically legal in both languages by all refereeswere included in the study as neutral pseudo-wordsSimilarly marked pseudo-words were only included ifall referees rated them as pronounceable in the markinglanguage and illegal in the other language

ProcedureParticipants performed a speeded language decision taskin which they were required to categorize whether apseudo-word was more likely to be a German or anEnglish word Stimuli were presented on a 18primeprime computerscreen (Arial font size 40) using the Psychtoolbox(Brainard 1997 Kleiner Brainard Pelli Ingling Murrayamp Broussard 2007) for MATLAB (version 7100Mathworks Inc Natick Massachusetts) A trial consistedof a fixation cross presented for 800 ms followedby presentation of the stimulus until the participantresponded for at most 2 sec Responses were given bybutton press on a custom-made response box with the twoindex fingers Buttons were assigned to a language byGerman and British flags in the respective corners of thescreen below the stimulus Button assignment remainedconstant within but switched between blocks On eachtrial response latency (RT) and language decision wererecorded

The task began with instructions and an example trialfollowed by the 384 experimental trials presented in 8blocks of equal length1 Pseudo-randomization ensuredthat not more than three equally marked items wouldappear consecutively Participants could have self-pacedbreaks after each block If a participant failed to respondbefore the time-out for more than three times he or shewas reminded to respond faster Subsequent to completionof the language decision task participants completed theproficiency tests described above

Statistical analysisAll statistical analyses were performed in the opensource statistical programming environment R (R coreteam 2013) RTs were analysed with linear mixed-effects models and language decisions were analysedusing logistic mixed-effects models (Baayen Davidsonamp Bates 2008 Jaeger 2008) as implemented in thelme4 package for R (Bates Maechler Bolker amp Walker2014) and included subjects and items as random factorsDue to the complex random factor structure of our datathe implementation of maximal random factor structurewas not practicable Instead we modelled the randomfactor structure with the intercept and the maximal order

1 The first 2 blocks contained neutral pseudo-words only Block typedid not affect the variables of interest and hence results are reportedcollapsed across block types

interaction of fixed factors as recently suggested (Barr2013) Significance of fixed effects was assessed throughtype-III Wald-tests of single parameter estimates Wefollow the convention of the lme4 package by reportingthe z-test statistic for logistic mixed-effects models andthe χ2-test statistic for linear models (note though thatboth test statistics are equivalent)

Note that it is not possible to define correctand wrong responses for neutral pseudo-words sincemost of them contain conflicting language membershipinformation and no exclusive language cues Thuslanguage decision analyses for neutral pseudo-wordsmodelled the probability of English responses Howeverfor marked pseudo-words the marking creates a strongpreference for the marker language Thus for markedpseudo-words we conducted analyses on the marker-incongruent responses to which we refer as errorresponses

First to examine whether orthographic markers biasedlanguage decisions we analysed the effects of marker type(German (G) vs English (E) vs neutral (N))2 Then theeffects of continuous variables on language decision andresponse latencies were analysed separately for neutraland marked pseudo-words

Outlier exclusion was performed for each participantand separately for marked and neutral PWs based onseparate multiple regression models for each participantcontaining the same factors that were used for RTanalyses The expected RT for each trial was determinedand trials with a difference of more than 25 residualsrsquo SDbetween expected and actual RT were labelled as outliersand discarded from further analyses

Results

Data of all participants were included in the analysesOutlier analysis led to a removal of 25 of the trialsFor illustration purposes only continuous variables weresubdivided in three equally large subsets of which thetwo extreme ones are depicted in figures with the labelldquoGermanrdquo for the subset with most positive ie mostGerman-like values and the label ldquoEnglishrdquo for the mostnegative ie most English-like values

Effect of marker presence on language decisionsWe analyzed the effect of marker language (G vs E vs N)on the probability of English responses with a logisticmixed-effects model which included marker languageas a fixed factor intercepts for subject and items asrandom factors and random slopes for marker language

2 Due to the high number of marker-congruent responses for markedPWs there were not enough error trials (lt 15 for most participants)to analyze RTs across marked and neutral PWs including response asfactor Thus we analyze RTs separately for marked and neutral PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 585

Table 5 Mean and standard deviations (SD) for RTs to marked and neutral PW in Experiment 1 and 2

RT

Response German Response English Response English

Exp 1 mean SD mean SD mean SD

Markedness English 853 315 746 204 90 29

German 776 220 859 316 16 36

Neutral 871 279 905 292 45 50

Exp 2

mean SD mean SD mean SD

Markedness English 758 244 778 311 79 41

German 730 215 792 307 17 38

Neutral 712 211 768 318 42 49

within subject Marker language was contrast-coded inthe model with the two orthogonal contrasts English-marked and neutral versus German-marked and English-marked vs neutral PWs Marker language influenced theattribution of pseudo-words towards that language (E90 G 17 N 45 response English see Table 5)χ2 (1) = 4629 p lt 001 Planned comparisons whichwere directly encoded in the model showed that English-marked and neutral PWs were more often classified asEnglish than German-marked PWs b = 208 SD = 011z = 187 p lt 001 and that the same held for English-marked as compared to neutral pseudo-words b = 14SD = 008 z = 180 p lt 001

Effects of continuous variables in the absence ofmarkers Language decisionThe analysis of language decisions on neutral PWsinvolved the continuous variables diffBF maxdiffBF anddiffOLD as well as all possible interactions between themas fixed effects Random factor structure of the modelincluded intercepts for subjects and items as well asrandom slopes for the interaction of diffBF maxdiffBFand diffOLD As expected neutral pseudo-words weremore often categorized as similar to the language in whichtheir orthographic neighborhood was larger b = minus028SD = 007 z = 39 p lt 001 (Figure 2) No significanteffects were obtained for maxdiffBF and diffBF or for anyinteractions

Effects of continuous variables in the absence ofmarkers Response timesThe analysis of response times to neutral PWs (Table 6)contained the fixed factors response language (G vsE dummy coded) and the continuous variables diffBFmaxdiffBF and diffOLD Random factor structure of themodel included intercepts for subjects and items randomslopes for the interaction of response diffBF maxdiffBFand diffOLD within subjects and random slopes for

response within items Responses ldquoGermanrdquo were overallfaster than responses ldquoEnglishrdquo (see Table 5 for meanRTs) While there was no main effect for diffOLD itsinteraction with response was significant (Figure 3b)increases in diffOLD (ie increasingly larger Germancompared to English neighborhood) resulted in fasterreaction times (b =minus354) for ldquoGermanrdquo responses whileslowing ldquoEnglishrdquo responses (b = ndash354 + 2251 19)A similar marginally significant pattern was foundfor maxdiffBF (Figure 3c b response German minus12b response English 14) ldquoGermanrdquo responses were faster ifmaxdiffBF was positive (ie German-typical) whereasldquoEnglishrdquo responses were faster for negative maxdiffBFvalues (ie English-typical) The significant interactionof diffOLD and maxdiffBF (Figure 3a) resulted fromfaster responses for items with consistent diffOLDand maxdiffBF values and slower responses wheneither positive diffOLD was accompanied by negativemaxdiffBF values or vice versa In other words responseswere fast if both variables supported the choice of the samelanguage and slowed if they pointed to different languages(eg when a PW had more orthographic neighbors inEnglish than in German but a more German-typicalmaxdiffBF value) supporting the importance of bothvariables for the decision process All other main effectsand interactions were not significant

Interactions of markers and continuous variables3Language decisionAnalysis of errors for marked PWs contained thefixed factors marker language (G vs E) and thecontinuous variables diffBF and diffOLD as well asall of their interactions Random factor structure of

3 In both analyses marker language was dummy-coded with Englishas baseline such that main effects of continuous variables reflectedtheir effect for English-marked PWs and their interaction with markerlanguage the addition in beta for German-marked PWs as comparedto English-marked PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

586 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 6 Summary of linear mixed-effect regression for reaction times to neutral PW in Experiment 1 Orthographicneighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency difference diffBF = BFG ndash BFE maximalbigram frequency difference maxdiffBF is the maximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ 2 p-value Significance

Intercept 87774 3248 2702 7303 lt 001 lowastlowastdiffOLD minus354 600 minus059 035 56

diffBF minus1673 899 minus186 346 06 +

maxdiffBF minus1158 957 minus121 147 23

Response1 4749 893 532 2827 lt 001 lowastlowastdiffOLD diffBF 1259 931 135 183 18

diffOLD maxdiffBF minus1825 890 minus205 421 04 lowastdiffBF maxdiffBF minus485 1028 minus047 022 64

diffOLD Response 2251 882 255 652 01 lowastdiffBF Response 165 1213 014 002 89

maxdiffBF Response 2554 1391 184 337 07 +

diffOLD diffBF maxdiffBF 1189 1041 114 131 25

diffOLD diffBF Response minus1524 1309 minus116 135 24

diffOLD maxdiffBF Response 962 1300 074 055 46

diffBF maxdiffBF Response minus2260 1409 minus16 257 11

diffOLD diffBF maxdiffBF Response minus1478 1450 minus102 104 31

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

Experiment 1 Experiment 2

30

40

50

60

70

German English German EnglishdiffOLD

R

espo

nse

Eng

lish

maxdiffBF

German

English

Figure 2 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) and maximal bigramfrequency difference (maxdiffBF) on language decisions for neutral pseudo-words in Experiments 1 and 2

the model included intercepts for subjects and itemsand random slopes for the interaction of responseand diffOLD within subjects Overall above 80 ofresponses to marked pseudo-words were in line with theirmarker although more marker-incongruent responseswere made for German-marked than for English-markedPWs b = 06 SD = 022 z = 28 p = 006 (G 17E 10 incongruent responses see Table 5) Cruciallythe effect of marker language interacted with diffOLDb = minus06 SD = 02 z = ndash33 p lt001 (Figure 4a)Planned comparisons showed an increase in marker-

incongruent responses for German-marked PWs with adecrease in diffOLD (ie more English neighbors thanGerman neighbors) b = minus04 SD = 01 z = minus37 plt 001 whereas diffOLD had no effect on responses forEnglish-marked PWs (p gt 1)

Interactions of markers and continuous variablesResponse times

Analysis of RTs to marked PWs involved the fixedfactors marker language (G vs E) diffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 587

Table 7 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 1 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

(Intercept) 74556 2469 3020 91195 lt 001 lowastlowastdiffOLD 736 694 106 113 29

diffBF minus1226 842 minus146 212 15

Marker Language1 4767 1393 342 1170 lt 001 lowastlowastdiffOLD diffBF minus323 641 minus050 025 61

diffOLD Marker Language minus2539 965 minus263 693 008 lowastdiffBF Marker Language 379 1093 035 012 73

diffOLD diffBF Marker Language 1580 850 186 346 06 +

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Marker language was dummy-coded with English as baseline

800

850

900

950

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p = 04

800

850

900

950

German EnglishResponse

RT

[mse

c]b) diffOLD x Response p = 01

800

850

900

950

German EnglishResponse

RT

[mse

c]

c) maxdiffBF x Response p = 07

maxdiffBF German English

diffOLD German English

Figure 3 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) and language decisions (response) on RTs to neutral pseudo-words in Experiment 1 a)Interaction effect of diffOLD and maxdiffBF b) Interaction effect of diffOLD and response c) marginally significantinteraction effect of maxdiffBF and response

(Table 7) Random factor structure of the model includedintercepts for subjects and items and random slopesfor the interaction of marker language diffBF anddiffOLD Participants made 12 errors on average (English-marked 2ndash24 error trials German-marked 6ndash33 error

trials) which was not sufficient for an RT analysison error trials Thus RTs were analyzed for correctresponses only The results largely mirrored the patternof language decisions on marked PWs Responses toGerman PWs were slower than responses to English PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

588 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Figure 4 Visualization of interaction effects of differences in orthographic neighborhood size (diffOLD) and markerlanguage on the errors and RTs to marked pseudo-words in Experiments 1 and 2

(b = 477 see Table 7) Moreover there was a significant2-way interaction of marker language and diffOLD (b= minus254 Figure 4b) and a marginally significant 3-wayinteraction between diffOLD diffBF and marker language(b = 158) A separate model for English-marked PWsshowed no effects of the continuous variables for English-marked PWs (p gt 1) However the separate model forGerman-marked PWs showed that responses for German-marked PWs were slowed for larger English than Germanneighborhoods (ie decrease in diffOLD) b = minus189 SD= 65 t = minus280 χ2 (1) = 78 p = 005 Additionallythis effect was attenuated for more positive diffBF valuesb = 120 SD = 53 t = 227 χ2 (1) = 51 p = 02In other words differences in orthographic neighborhoodsize mattered less if the mean bigram frequency differencewas in line with marker language for German-markedPWs

Discussion

Our data replicate previous studies (Casaponsa et al2014 Thomas amp Allport 2000 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) that showed thatsublexical orthographic markers provide strong language

decision cues as orthographically marked pseudo-wordswere mostly categorized in line with their marker

Our data further extends previous findings in severalways First we show that continuous differences inlexical neighborhood size and maximal bigram frequencydifferences influence language decisions in unmarkedpseudo-words This was apparent in language attributionas well as in response latencies to neutral pseudo-words In particular response latencies increased whensublexical vs lexical levels provided conflicting languagemembership information suggesting that languagemembership information from both levels is integratedtowards a language decision Second we also find thatorthographic neighborhoods bias language decisions evenfor marked letter strings ndash as reflected by more marker-incongruent responses and slowed marker-congruentresponses for L1-marked PWs with larger orthographicneighborhoods in L2 than in L1 This effect was absentfor L2-marked PWs probably because our German-dominant participants could discard these pseudo-wordsas non-German based on their orthographic markers ndashrepresenting violations of L1 orthographic patterns ndashalone This also explains why correct categorizationsof English marked PWs were faster than respectivecorrect German categorizations (for a similar argument

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 5: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

582 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 1 Descriptive statistics of the difference variables their pair-wise correlation in the English andGerman SUBTLEX lexica and Cohenrsquos d for the comparison between English and German distributions ofeach variable Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequencydifference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

SUBTLEX Lexicon

English German

Cohenrsquos d mean min max SD mean min max SD

diffOLD 27 minus17 minus54 33 10 21 minus32 75 13

diffBF 02 minus09 minus59 34 09 02 minus55 44 11

maxdiffBF 14 minus10 minus31 28 10 10 minus31 28 10

Correlations

diffBF maxdiffBF diffBF maxdiffBF

diffOLD 022 027 034 038

diffBF 03 028

Table 2 Profiles of participants in Experiment 1 and 2 There were no significant differences between participants ofExperiment 1 and 2

Experiment 1 Experiment 2

L1 (German) L2(English) L1 (German) L2(English)

mean SD mean SD mean SD mean SD

Lextale 908 69 805lowast 87 904 50 770lowast 130

reading rate wmin real words 1314 102 124 lowast 90 1300 140 1232 lowast 95

pseudo-words 835 175 813 135 835 200 786 106

Self report Age of acquisition (years) ndash ndash 97 164 ndash ndash 93 20

Proficiency1 ndash ndash 60 05 ndash ndash 59 04

Accent2 ndash ndash 24 10 ndash ndash 27 08

Notes1 On a scale of 1 (basic) ndash 7 (native)2 On a scale of 1(no accent) ndash 7 (very strong accent)lowast p lt 05 for comparison between L1 and L2

study if they fulfilled the above criteria Self-reportsof L2 proficiency were made on a 1-7 Likert scaleseparately for reading writing speaking and listeningabilities The averaged self-estimated English proficiencywas 59 (range 45ndash7) Additionally participantsrsquo generalproficiency and reading abilities were assessed after theexperiment using the LEXTALE tests of German andEnglish proficiency (Lemhoumlfer amp Broersma 2011) Thetests consist of short lexical decision tasks which includewords of varying frequency and pseudo-words The finalscore is the average percentage of correct responses towords and pseudo-words Reading abilities were assessedusing the reading and phonological decoding subtests ofthe TOWRE (Torgesen amp Rashotte 1999) for English andthe word and pseudo-word reading subtests of the SLRT-IItest (Moll amp Landerl 2010) for German Both tests assess

reading rate (wordsmin) in single-item reading of wordsand pseudo-words (PW) Participantsrsquo language profileis summarized in Table 2 Participants were recruitedthrough advertisements on campus and in mailing lists forexperiment participation All participants completed aninformed consent form prior to beginning the experimentThey were reimbursed either monetary or with coursecredit

Stimuli amp DesignThe stimulus set contained 192 marked pseudo-wordshalf of which were orthographically legal in Germanonly while the other half was orthographically legalin English only and 192 neutral pseudo-words whichwere orthographically legal in both languages Within theset of marked pseudo-words diffOLD and diffBF were

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 583

Table 3 Properties of neutral pseudo-words stimuli ofExperiments 1 and 2 Neutral pseudo-words containedonly bigrams that were legal in both languagesOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

distribution correlations

mean min max diffBF maxdiffBF

diffOLD minus016 minus260 240 024 038

diffBF 010 minus270 230 060

maxdiffBF 005 minus100 110

varied parametrically Within the set of neutral pseudo-words diffOLD diffBF and maxdiffBF were variedparametrically Pseudo-words were selected from theEnglish lexicon project (Balota Yap Cortese HutchisonKessler Loftis Neely Nelson Simpson amp Treiman2007) a German pseudo-word database provided byauthors MC and AA as well as a pseudo-word setcreated specifically for this study Note that all markedpseudo-words were composed of letters that exist in bothlanguages (ie excluding the German letters ldquoauml ouml uumlszligrdquo) such that decisions based on low-level visual pop-outeffects would not be possible

Neutral pseudo-wordsNeutral PWs were orthographically legal and pronounce-able with the same number of phonemes and syllableswhen pronounced in either language Selection of neutralpseudo-words was guided by three considerations Firstwe ensured that for all three variables ndash diffBF maxdiffBFand diffOLD ndash the range of overlap between their Englishand German distributions was covered (Figure 1 for thedistributions in the corpus and Table 3 for properties of thestimulus set) Second for each of the variables half of thestimuli had positive values and half had negative valuessuch that an approximately equal number of neutral PWwas German-like or English-like respectively in each ofthe three variables Third the pseudo-words were chosensuch as to reduce the pair-wise correlations between thethree variables as well as their correlations with pseudo-word length (see Table 3) Note that the correlationbetween diffBF and maxdiffBF in neutral pseudo-wordsis high as the two variables rely on the same informationsource

Marked pseudo-wordsMarked PWs contained a letter combination (1ndash3 letters)that violated orthographic rules of one of the languages

Table 4 Properties of marked pseudo-words MarkedPWs contained at least one bigram that was legal in onelanguage (ie the marker language) and had a frequencyof 0 in monomorphemic words of the other languageOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

English-marked PW German-marked PW

mean min max mean min max

diffOLD minus090 minus400 100 070 minus200 270

diffBF minus070 minus360 160 050 minus240 310

maxdiffBF minus180 minus270 minus120 160 110 270

(such as th ght or word-final y for English or pf hlor word-middle sch for German markers) The bigramfrequency of marker bigrams is not necessarily 0 in therespective other language if computed across the wholeSUBTLEX because these letter combinations could occurin rare cases at the boundaries between (otherwise free)morphemes or in loan words and proper names Forexample the German marker pf occurs in English wordssuch as cuPFul or camPFire but never as a graphemewithin a morpheme or syllable Similarly the Englishmarker th occurs in German names (eg FuerTH)Greek loan words (eg THeologie (theology)) and asa bigram unifying two normally free morphemes (egachTHundert (eight hundred)) but not as a grapheme inother etymologically German words Since all our PWswere mono-morphemic we recomputed the frequenciesfor all orthographic markers based on mono-morphemicwords (that is consisting of only one root morphemeplus a simple ending) of the relevant word lengthsIn the resulting set of words marker frequencies were0 in the respectively other language As orthographicmarkers coincide with bigrams that define maxdiffBFvalues marked pseudo-words were chosen to differparametrically in their diffBF and diffOLD values onlywhereas English-marked and German-marked pseudo-words were balanced with respect to maxdiffBF values(correlation between diffOLD and diffBF rE = minus19rG = 42 see Table 4 for descriptive statistics) Lengthand syllable number were matched across marker typeconditions and did not correlate with experimentalvariables

To further ensure that marked pseudo-words wereindeed pronounceable and orthographically legal in onelanguage only and neutral pseudo-words in both allpseudo-words were rated for their pronounceability andorthographic legality in German and English by 3 native

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

584 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

speakers of German and English respectively Onlypseudo-words that were rated as pronounceable andorthographically legal in both languages by all refereeswere included in the study as neutral pseudo-wordsSimilarly marked pseudo-words were only included ifall referees rated them as pronounceable in the markinglanguage and illegal in the other language

ProcedureParticipants performed a speeded language decision taskin which they were required to categorize whether apseudo-word was more likely to be a German or anEnglish word Stimuli were presented on a 18primeprime computerscreen (Arial font size 40) using the Psychtoolbox(Brainard 1997 Kleiner Brainard Pelli Ingling Murrayamp Broussard 2007) for MATLAB (version 7100Mathworks Inc Natick Massachusetts) A trial consistedof a fixation cross presented for 800 ms followedby presentation of the stimulus until the participantresponded for at most 2 sec Responses were given bybutton press on a custom-made response box with the twoindex fingers Buttons were assigned to a language byGerman and British flags in the respective corners of thescreen below the stimulus Button assignment remainedconstant within but switched between blocks On eachtrial response latency (RT) and language decision wererecorded

The task began with instructions and an example trialfollowed by the 384 experimental trials presented in 8blocks of equal length1 Pseudo-randomization ensuredthat not more than three equally marked items wouldappear consecutively Participants could have self-pacedbreaks after each block If a participant failed to respondbefore the time-out for more than three times he or shewas reminded to respond faster Subsequent to completionof the language decision task participants completed theproficiency tests described above

Statistical analysisAll statistical analyses were performed in the opensource statistical programming environment R (R coreteam 2013) RTs were analysed with linear mixed-effects models and language decisions were analysedusing logistic mixed-effects models (Baayen Davidsonamp Bates 2008 Jaeger 2008) as implemented in thelme4 package for R (Bates Maechler Bolker amp Walker2014) and included subjects and items as random factorsDue to the complex random factor structure of our datathe implementation of maximal random factor structurewas not practicable Instead we modelled the randomfactor structure with the intercept and the maximal order

1 The first 2 blocks contained neutral pseudo-words only Block typedid not affect the variables of interest and hence results are reportedcollapsed across block types

interaction of fixed factors as recently suggested (Barr2013) Significance of fixed effects was assessed throughtype-III Wald-tests of single parameter estimates Wefollow the convention of the lme4 package by reportingthe z-test statistic for logistic mixed-effects models andthe χ2-test statistic for linear models (note though thatboth test statistics are equivalent)

Note that it is not possible to define correctand wrong responses for neutral pseudo-words sincemost of them contain conflicting language membershipinformation and no exclusive language cues Thuslanguage decision analyses for neutral pseudo-wordsmodelled the probability of English responses Howeverfor marked pseudo-words the marking creates a strongpreference for the marker language Thus for markedpseudo-words we conducted analyses on the marker-incongruent responses to which we refer as errorresponses

First to examine whether orthographic markers biasedlanguage decisions we analysed the effects of marker type(German (G) vs English (E) vs neutral (N))2 Then theeffects of continuous variables on language decision andresponse latencies were analysed separately for neutraland marked pseudo-words

Outlier exclusion was performed for each participantand separately for marked and neutral PWs based onseparate multiple regression models for each participantcontaining the same factors that were used for RTanalyses The expected RT for each trial was determinedand trials with a difference of more than 25 residualsrsquo SDbetween expected and actual RT were labelled as outliersand discarded from further analyses

Results

Data of all participants were included in the analysesOutlier analysis led to a removal of 25 of the trialsFor illustration purposes only continuous variables weresubdivided in three equally large subsets of which thetwo extreme ones are depicted in figures with the labelldquoGermanrdquo for the subset with most positive ie mostGerman-like values and the label ldquoEnglishrdquo for the mostnegative ie most English-like values

Effect of marker presence on language decisionsWe analyzed the effect of marker language (G vs E vs N)on the probability of English responses with a logisticmixed-effects model which included marker languageas a fixed factor intercepts for subject and items asrandom factors and random slopes for marker language

2 Due to the high number of marker-congruent responses for markedPWs there were not enough error trials (lt 15 for most participants)to analyze RTs across marked and neutral PWs including response asfactor Thus we analyze RTs separately for marked and neutral PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 585

Table 5 Mean and standard deviations (SD) for RTs to marked and neutral PW in Experiment 1 and 2

RT

Response German Response English Response English

Exp 1 mean SD mean SD mean SD

Markedness English 853 315 746 204 90 29

German 776 220 859 316 16 36

Neutral 871 279 905 292 45 50

Exp 2

mean SD mean SD mean SD

Markedness English 758 244 778 311 79 41

German 730 215 792 307 17 38

Neutral 712 211 768 318 42 49

within subject Marker language was contrast-coded inthe model with the two orthogonal contrasts English-marked and neutral versus German-marked and English-marked vs neutral PWs Marker language influenced theattribution of pseudo-words towards that language (E90 G 17 N 45 response English see Table 5)χ2 (1) = 4629 p lt 001 Planned comparisons whichwere directly encoded in the model showed that English-marked and neutral PWs were more often classified asEnglish than German-marked PWs b = 208 SD = 011z = 187 p lt 001 and that the same held for English-marked as compared to neutral pseudo-words b = 14SD = 008 z = 180 p lt 001

Effects of continuous variables in the absence ofmarkers Language decisionThe analysis of language decisions on neutral PWsinvolved the continuous variables diffBF maxdiffBF anddiffOLD as well as all possible interactions between themas fixed effects Random factor structure of the modelincluded intercepts for subjects and items as well asrandom slopes for the interaction of diffBF maxdiffBFand diffOLD As expected neutral pseudo-words weremore often categorized as similar to the language in whichtheir orthographic neighborhood was larger b = minus028SD = 007 z = 39 p lt 001 (Figure 2) No significanteffects were obtained for maxdiffBF and diffBF or for anyinteractions

Effects of continuous variables in the absence ofmarkers Response timesThe analysis of response times to neutral PWs (Table 6)contained the fixed factors response language (G vsE dummy coded) and the continuous variables diffBFmaxdiffBF and diffOLD Random factor structure of themodel included intercepts for subjects and items randomslopes for the interaction of response diffBF maxdiffBFand diffOLD within subjects and random slopes for

response within items Responses ldquoGermanrdquo were overallfaster than responses ldquoEnglishrdquo (see Table 5 for meanRTs) While there was no main effect for diffOLD itsinteraction with response was significant (Figure 3b)increases in diffOLD (ie increasingly larger Germancompared to English neighborhood) resulted in fasterreaction times (b =minus354) for ldquoGermanrdquo responses whileslowing ldquoEnglishrdquo responses (b = ndash354 + 2251 19)A similar marginally significant pattern was foundfor maxdiffBF (Figure 3c b response German minus12b response English 14) ldquoGermanrdquo responses were faster ifmaxdiffBF was positive (ie German-typical) whereasldquoEnglishrdquo responses were faster for negative maxdiffBFvalues (ie English-typical) The significant interactionof diffOLD and maxdiffBF (Figure 3a) resulted fromfaster responses for items with consistent diffOLDand maxdiffBF values and slower responses wheneither positive diffOLD was accompanied by negativemaxdiffBF values or vice versa In other words responseswere fast if both variables supported the choice of the samelanguage and slowed if they pointed to different languages(eg when a PW had more orthographic neighbors inEnglish than in German but a more German-typicalmaxdiffBF value) supporting the importance of bothvariables for the decision process All other main effectsand interactions were not significant

Interactions of markers and continuous variables3Language decisionAnalysis of errors for marked PWs contained thefixed factors marker language (G vs E) and thecontinuous variables diffBF and diffOLD as well asall of their interactions Random factor structure of

3 In both analyses marker language was dummy-coded with Englishas baseline such that main effects of continuous variables reflectedtheir effect for English-marked PWs and their interaction with markerlanguage the addition in beta for German-marked PWs as comparedto English-marked PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

586 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 6 Summary of linear mixed-effect regression for reaction times to neutral PW in Experiment 1 Orthographicneighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency difference diffBF = BFG ndash BFE maximalbigram frequency difference maxdiffBF is the maximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ 2 p-value Significance

Intercept 87774 3248 2702 7303 lt 001 lowastlowastdiffOLD minus354 600 minus059 035 56

diffBF minus1673 899 minus186 346 06 +

maxdiffBF minus1158 957 minus121 147 23

Response1 4749 893 532 2827 lt 001 lowastlowastdiffOLD diffBF 1259 931 135 183 18

diffOLD maxdiffBF minus1825 890 minus205 421 04 lowastdiffBF maxdiffBF minus485 1028 minus047 022 64

diffOLD Response 2251 882 255 652 01 lowastdiffBF Response 165 1213 014 002 89

maxdiffBF Response 2554 1391 184 337 07 +

diffOLD diffBF maxdiffBF 1189 1041 114 131 25

diffOLD diffBF Response minus1524 1309 minus116 135 24

diffOLD maxdiffBF Response 962 1300 074 055 46

diffBF maxdiffBF Response minus2260 1409 minus16 257 11

diffOLD diffBF maxdiffBF Response minus1478 1450 minus102 104 31

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

Experiment 1 Experiment 2

30

40

50

60

70

German English German EnglishdiffOLD

R

espo

nse

Eng

lish

maxdiffBF

German

English

Figure 2 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) and maximal bigramfrequency difference (maxdiffBF) on language decisions for neutral pseudo-words in Experiments 1 and 2

the model included intercepts for subjects and itemsand random slopes for the interaction of responseand diffOLD within subjects Overall above 80 ofresponses to marked pseudo-words were in line with theirmarker although more marker-incongruent responseswere made for German-marked than for English-markedPWs b = 06 SD = 022 z = 28 p = 006 (G 17E 10 incongruent responses see Table 5) Cruciallythe effect of marker language interacted with diffOLDb = minus06 SD = 02 z = ndash33 p lt001 (Figure 4a)Planned comparisons showed an increase in marker-

incongruent responses for German-marked PWs with adecrease in diffOLD (ie more English neighbors thanGerman neighbors) b = minus04 SD = 01 z = minus37 plt 001 whereas diffOLD had no effect on responses forEnglish-marked PWs (p gt 1)

Interactions of markers and continuous variablesResponse times

Analysis of RTs to marked PWs involved the fixedfactors marker language (G vs E) diffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 587

Table 7 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 1 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

(Intercept) 74556 2469 3020 91195 lt 001 lowastlowastdiffOLD 736 694 106 113 29

diffBF minus1226 842 minus146 212 15

Marker Language1 4767 1393 342 1170 lt 001 lowastlowastdiffOLD diffBF minus323 641 minus050 025 61

diffOLD Marker Language minus2539 965 minus263 693 008 lowastdiffBF Marker Language 379 1093 035 012 73

diffOLD diffBF Marker Language 1580 850 186 346 06 +

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Marker language was dummy-coded with English as baseline

800

850

900

950

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p = 04

800

850

900

950

German EnglishResponse

RT

[mse

c]b) diffOLD x Response p = 01

800

850

900

950

German EnglishResponse

RT

[mse

c]

c) maxdiffBF x Response p = 07

maxdiffBF German English

diffOLD German English

Figure 3 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) and language decisions (response) on RTs to neutral pseudo-words in Experiment 1 a)Interaction effect of diffOLD and maxdiffBF b) Interaction effect of diffOLD and response c) marginally significantinteraction effect of maxdiffBF and response

(Table 7) Random factor structure of the model includedintercepts for subjects and items and random slopesfor the interaction of marker language diffBF anddiffOLD Participants made 12 errors on average (English-marked 2ndash24 error trials German-marked 6ndash33 error

trials) which was not sufficient for an RT analysison error trials Thus RTs were analyzed for correctresponses only The results largely mirrored the patternof language decisions on marked PWs Responses toGerman PWs were slower than responses to English PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

588 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Figure 4 Visualization of interaction effects of differences in orthographic neighborhood size (diffOLD) and markerlanguage on the errors and RTs to marked pseudo-words in Experiments 1 and 2

(b = 477 see Table 7) Moreover there was a significant2-way interaction of marker language and diffOLD (b= minus254 Figure 4b) and a marginally significant 3-wayinteraction between diffOLD diffBF and marker language(b = 158) A separate model for English-marked PWsshowed no effects of the continuous variables for English-marked PWs (p gt 1) However the separate model forGerman-marked PWs showed that responses for German-marked PWs were slowed for larger English than Germanneighborhoods (ie decrease in diffOLD) b = minus189 SD= 65 t = minus280 χ2 (1) = 78 p = 005 Additionallythis effect was attenuated for more positive diffBF valuesb = 120 SD = 53 t = 227 χ2 (1) = 51 p = 02In other words differences in orthographic neighborhoodsize mattered less if the mean bigram frequency differencewas in line with marker language for German-markedPWs

Discussion

Our data replicate previous studies (Casaponsa et al2014 Thomas amp Allport 2000 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) that showed thatsublexical orthographic markers provide strong language

decision cues as orthographically marked pseudo-wordswere mostly categorized in line with their marker

Our data further extends previous findings in severalways First we show that continuous differences inlexical neighborhood size and maximal bigram frequencydifferences influence language decisions in unmarkedpseudo-words This was apparent in language attributionas well as in response latencies to neutral pseudo-words In particular response latencies increased whensublexical vs lexical levels provided conflicting languagemembership information suggesting that languagemembership information from both levels is integratedtowards a language decision Second we also find thatorthographic neighborhoods bias language decisions evenfor marked letter strings ndash as reflected by more marker-incongruent responses and slowed marker-congruentresponses for L1-marked PWs with larger orthographicneighborhoods in L2 than in L1 This effect was absentfor L2-marked PWs probably because our German-dominant participants could discard these pseudo-wordsas non-German based on their orthographic markers ndashrepresenting violations of L1 orthographic patterns ndashalone This also explains why correct categorizationsof English marked PWs were faster than respectivecorrect German categorizations (for a similar argument

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 6: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

Sublexical and lexical effects on language membership decision 583

Table 3 Properties of neutral pseudo-words stimuli ofExperiments 1 and 2 Neutral pseudo-words containedonly bigrams that were legal in both languagesOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

distribution correlations

mean min max diffBF maxdiffBF

diffOLD minus016 minus260 240 024 038

diffBF 010 minus270 230 060

maxdiffBF 005 minus100 110

varied parametrically Within the set of neutral pseudo-words diffOLD diffBF and maxdiffBF were variedparametrically Pseudo-words were selected from theEnglish lexicon project (Balota Yap Cortese HutchisonKessler Loftis Neely Nelson Simpson amp Treiman2007) a German pseudo-word database provided byauthors MC and AA as well as a pseudo-word setcreated specifically for this study Note that all markedpseudo-words were composed of letters that exist in bothlanguages (ie excluding the German letters ldquoauml ouml uumlszligrdquo) such that decisions based on low-level visual pop-outeffects would not be possible

Neutral pseudo-wordsNeutral PWs were orthographically legal and pronounce-able with the same number of phonemes and syllableswhen pronounced in either language Selection of neutralpseudo-words was guided by three considerations Firstwe ensured that for all three variables ndash diffBF maxdiffBFand diffOLD ndash the range of overlap between their Englishand German distributions was covered (Figure 1 for thedistributions in the corpus and Table 3 for properties of thestimulus set) Second for each of the variables half of thestimuli had positive values and half had negative valuessuch that an approximately equal number of neutral PWwas German-like or English-like respectively in each ofthe three variables Third the pseudo-words were chosensuch as to reduce the pair-wise correlations between thethree variables as well as their correlations with pseudo-word length (see Table 3) Note that the correlationbetween diffBF and maxdiffBF in neutral pseudo-wordsis high as the two variables rely on the same informationsource

Marked pseudo-wordsMarked PWs contained a letter combination (1ndash3 letters)that violated orthographic rules of one of the languages

Table 4 Properties of marked pseudo-words MarkedPWs contained at least one bigram that was legal in onelanguage (ie the marker language) and had a frequencyof 0 in monomorphemic words of the other languageOrthographic neighborhood size diffOLD = OLDE ndashOLDG mean bigram frequency difference diffBF =BFG ndash BFE maximal bigram frequency differencemaxdiffBF is the maximal bigram frequency differenceacross all bigrams of a word

English-marked PW German-marked PW

mean min max mean min max

diffOLD minus090 minus400 100 070 minus200 270

diffBF minus070 minus360 160 050 minus240 310

maxdiffBF minus180 minus270 minus120 160 110 270

(such as th ght or word-final y for English or pf hlor word-middle sch for German markers) The bigramfrequency of marker bigrams is not necessarily 0 in therespective other language if computed across the wholeSUBTLEX because these letter combinations could occurin rare cases at the boundaries between (otherwise free)morphemes or in loan words and proper names Forexample the German marker pf occurs in English wordssuch as cuPFul or camPFire but never as a graphemewithin a morpheme or syllable Similarly the Englishmarker th occurs in German names (eg FuerTH)Greek loan words (eg THeologie (theology)) and asa bigram unifying two normally free morphemes (egachTHundert (eight hundred)) but not as a grapheme inother etymologically German words Since all our PWswere mono-morphemic we recomputed the frequenciesfor all orthographic markers based on mono-morphemicwords (that is consisting of only one root morphemeplus a simple ending) of the relevant word lengthsIn the resulting set of words marker frequencies were0 in the respectively other language As orthographicmarkers coincide with bigrams that define maxdiffBFvalues marked pseudo-words were chosen to differparametrically in their diffBF and diffOLD values onlywhereas English-marked and German-marked pseudo-words were balanced with respect to maxdiffBF values(correlation between diffOLD and diffBF rE = minus19rG = 42 see Table 4 for descriptive statistics) Lengthand syllable number were matched across marker typeconditions and did not correlate with experimentalvariables

To further ensure that marked pseudo-words wereindeed pronounceable and orthographically legal in onelanguage only and neutral pseudo-words in both allpseudo-words were rated for their pronounceability andorthographic legality in German and English by 3 native

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

584 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

speakers of German and English respectively Onlypseudo-words that were rated as pronounceable andorthographically legal in both languages by all refereeswere included in the study as neutral pseudo-wordsSimilarly marked pseudo-words were only included ifall referees rated them as pronounceable in the markinglanguage and illegal in the other language

ProcedureParticipants performed a speeded language decision taskin which they were required to categorize whether apseudo-word was more likely to be a German or anEnglish word Stimuli were presented on a 18primeprime computerscreen (Arial font size 40) using the Psychtoolbox(Brainard 1997 Kleiner Brainard Pelli Ingling Murrayamp Broussard 2007) for MATLAB (version 7100Mathworks Inc Natick Massachusetts) A trial consistedof a fixation cross presented for 800 ms followedby presentation of the stimulus until the participantresponded for at most 2 sec Responses were given bybutton press on a custom-made response box with the twoindex fingers Buttons were assigned to a language byGerman and British flags in the respective corners of thescreen below the stimulus Button assignment remainedconstant within but switched between blocks On eachtrial response latency (RT) and language decision wererecorded

The task began with instructions and an example trialfollowed by the 384 experimental trials presented in 8blocks of equal length1 Pseudo-randomization ensuredthat not more than three equally marked items wouldappear consecutively Participants could have self-pacedbreaks after each block If a participant failed to respondbefore the time-out for more than three times he or shewas reminded to respond faster Subsequent to completionof the language decision task participants completed theproficiency tests described above

Statistical analysisAll statistical analyses were performed in the opensource statistical programming environment R (R coreteam 2013) RTs were analysed with linear mixed-effects models and language decisions were analysedusing logistic mixed-effects models (Baayen Davidsonamp Bates 2008 Jaeger 2008) as implemented in thelme4 package for R (Bates Maechler Bolker amp Walker2014) and included subjects and items as random factorsDue to the complex random factor structure of our datathe implementation of maximal random factor structurewas not practicable Instead we modelled the randomfactor structure with the intercept and the maximal order

1 The first 2 blocks contained neutral pseudo-words only Block typedid not affect the variables of interest and hence results are reportedcollapsed across block types

interaction of fixed factors as recently suggested (Barr2013) Significance of fixed effects was assessed throughtype-III Wald-tests of single parameter estimates Wefollow the convention of the lme4 package by reportingthe z-test statistic for logistic mixed-effects models andthe χ2-test statistic for linear models (note though thatboth test statistics are equivalent)

Note that it is not possible to define correctand wrong responses for neutral pseudo-words sincemost of them contain conflicting language membershipinformation and no exclusive language cues Thuslanguage decision analyses for neutral pseudo-wordsmodelled the probability of English responses Howeverfor marked pseudo-words the marking creates a strongpreference for the marker language Thus for markedpseudo-words we conducted analyses on the marker-incongruent responses to which we refer as errorresponses

First to examine whether orthographic markers biasedlanguage decisions we analysed the effects of marker type(German (G) vs English (E) vs neutral (N))2 Then theeffects of continuous variables on language decision andresponse latencies were analysed separately for neutraland marked pseudo-words

Outlier exclusion was performed for each participantand separately for marked and neutral PWs based onseparate multiple regression models for each participantcontaining the same factors that were used for RTanalyses The expected RT for each trial was determinedand trials with a difference of more than 25 residualsrsquo SDbetween expected and actual RT were labelled as outliersand discarded from further analyses

Results

Data of all participants were included in the analysesOutlier analysis led to a removal of 25 of the trialsFor illustration purposes only continuous variables weresubdivided in three equally large subsets of which thetwo extreme ones are depicted in figures with the labelldquoGermanrdquo for the subset with most positive ie mostGerman-like values and the label ldquoEnglishrdquo for the mostnegative ie most English-like values

Effect of marker presence on language decisionsWe analyzed the effect of marker language (G vs E vs N)on the probability of English responses with a logisticmixed-effects model which included marker languageas a fixed factor intercepts for subject and items asrandom factors and random slopes for marker language

2 Due to the high number of marker-congruent responses for markedPWs there were not enough error trials (lt 15 for most participants)to analyze RTs across marked and neutral PWs including response asfactor Thus we analyze RTs separately for marked and neutral PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 585

Table 5 Mean and standard deviations (SD) for RTs to marked and neutral PW in Experiment 1 and 2

RT

Response German Response English Response English

Exp 1 mean SD mean SD mean SD

Markedness English 853 315 746 204 90 29

German 776 220 859 316 16 36

Neutral 871 279 905 292 45 50

Exp 2

mean SD mean SD mean SD

Markedness English 758 244 778 311 79 41

German 730 215 792 307 17 38

Neutral 712 211 768 318 42 49

within subject Marker language was contrast-coded inthe model with the two orthogonal contrasts English-marked and neutral versus German-marked and English-marked vs neutral PWs Marker language influenced theattribution of pseudo-words towards that language (E90 G 17 N 45 response English see Table 5)χ2 (1) = 4629 p lt 001 Planned comparisons whichwere directly encoded in the model showed that English-marked and neutral PWs were more often classified asEnglish than German-marked PWs b = 208 SD = 011z = 187 p lt 001 and that the same held for English-marked as compared to neutral pseudo-words b = 14SD = 008 z = 180 p lt 001

Effects of continuous variables in the absence ofmarkers Language decisionThe analysis of language decisions on neutral PWsinvolved the continuous variables diffBF maxdiffBF anddiffOLD as well as all possible interactions between themas fixed effects Random factor structure of the modelincluded intercepts for subjects and items as well asrandom slopes for the interaction of diffBF maxdiffBFand diffOLD As expected neutral pseudo-words weremore often categorized as similar to the language in whichtheir orthographic neighborhood was larger b = minus028SD = 007 z = 39 p lt 001 (Figure 2) No significanteffects were obtained for maxdiffBF and diffBF or for anyinteractions

Effects of continuous variables in the absence ofmarkers Response timesThe analysis of response times to neutral PWs (Table 6)contained the fixed factors response language (G vsE dummy coded) and the continuous variables diffBFmaxdiffBF and diffOLD Random factor structure of themodel included intercepts for subjects and items randomslopes for the interaction of response diffBF maxdiffBFand diffOLD within subjects and random slopes for

response within items Responses ldquoGermanrdquo were overallfaster than responses ldquoEnglishrdquo (see Table 5 for meanRTs) While there was no main effect for diffOLD itsinteraction with response was significant (Figure 3b)increases in diffOLD (ie increasingly larger Germancompared to English neighborhood) resulted in fasterreaction times (b =minus354) for ldquoGermanrdquo responses whileslowing ldquoEnglishrdquo responses (b = ndash354 + 2251 19)A similar marginally significant pattern was foundfor maxdiffBF (Figure 3c b response German minus12b response English 14) ldquoGermanrdquo responses were faster ifmaxdiffBF was positive (ie German-typical) whereasldquoEnglishrdquo responses were faster for negative maxdiffBFvalues (ie English-typical) The significant interactionof diffOLD and maxdiffBF (Figure 3a) resulted fromfaster responses for items with consistent diffOLDand maxdiffBF values and slower responses wheneither positive diffOLD was accompanied by negativemaxdiffBF values or vice versa In other words responseswere fast if both variables supported the choice of the samelanguage and slowed if they pointed to different languages(eg when a PW had more orthographic neighbors inEnglish than in German but a more German-typicalmaxdiffBF value) supporting the importance of bothvariables for the decision process All other main effectsand interactions were not significant

Interactions of markers and continuous variables3Language decisionAnalysis of errors for marked PWs contained thefixed factors marker language (G vs E) and thecontinuous variables diffBF and diffOLD as well asall of their interactions Random factor structure of

3 In both analyses marker language was dummy-coded with Englishas baseline such that main effects of continuous variables reflectedtheir effect for English-marked PWs and their interaction with markerlanguage the addition in beta for German-marked PWs as comparedto English-marked PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

586 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 6 Summary of linear mixed-effect regression for reaction times to neutral PW in Experiment 1 Orthographicneighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency difference diffBF = BFG ndash BFE maximalbigram frequency difference maxdiffBF is the maximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ 2 p-value Significance

Intercept 87774 3248 2702 7303 lt 001 lowastlowastdiffOLD minus354 600 minus059 035 56

diffBF minus1673 899 minus186 346 06 +

maxdiffBF minus1158 957 minus121 147 23

Response1 4749 893 532 2827 lt 001 lowastlowastdiffOLD diffBF 1259 931 135 183 18

diffOLD maxdiffBF minus1825 890 minus205 421 04 lowastdiffBF maxdiffBF minus485 1028 minus047 022 64

diffOLD Response 2251 882 255 652 01 lowastdiffBF Response 165 1213 014 002 89

maxdiffBF Response 2554 1391 184 337 07 +

diffOLD diffBF maxdiffBF 1189 1041 114 131 25

diffOLD diffBF Response minus1524 1309 minus116 135 24

diffOLD maxdiffBF Response 962 1300 074 055 46

diffBF maxdiffBF Response minus2260 1409 minus16 257 11

diffOLD diffBF maxdiffBF Response minus1478 1450 minus102 104 31

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

Experiment 1 Experiment 2

30

40

50

60

70

German English German EnglishdiffOLD

R

espo

nse

Eng

lish

maxdiffBF

German

English

Figure 2 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) and maximal bigramfrequency difference (maxdiffBF) on language decisions for neutral pseudo-words in Experiments 1 and 2

the model included intercepts for subjects and itemsand random slopes for the interaction of responseand diffOLD within subjects Overall above 80 ofresponses to marked pseudo-words were in line with theirmarker although more marker-incongruent responseswere made for German-marked than for English-markedPWs b = 06 SD = 022 z = 28 p = 006 (G 17E 10 incongruent responses see Table 5) Cruciallythe effect of marker language interacted with diffOLDb = minus06 SD = 02 z = ndash33 p lt001 (Figure 4a)Planned comparisons showed an increase in marker-

incongruent responses for German-marked PWs with adecrease in diffOLD (ie more English neighbors thanGerman neighbors) b = minus04 SD = 01 z = minus37 plt 001 whereas diffOLD had no effect on responses forEnglish-marked PWs (p gt 1)

Interactions of markers and continuous variablesResponse times

Analysis of RTs to marked PWs involved the fixedfactors marker language (G vs E) diffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 587

Table 7 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 1 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

(Intercept) 74556 2469 3020 91195 lt 001 lowastlowastdiffOLD 736 694 106 113 29

diffBF minus1226 842 minus146 212 15

Marker Language1 4767 1393 342 1170 lt 001 lowastlowastdiffOLD diffBF minus323 641 minus050 025 61

diffOLD Marker Language minus2539 965 minus263 693 008 lowastdiffBF Marker Language 379 1093 035 012 73

diffOLD diffBF Marker Language 1580 850 186 346 06 +

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Marker language was dummy-coded with English as baseline

800

850

900

950

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p = 04

800

850

900

950

German EnglishResponse

RT

[mse

c]b) diffOLD x Response p = 01

800

850

900

950

German EnglishResponse

RT

[mse

c]

c) maxdiffBF x Response p = 07

maxdiffBF German English

diffOLD German English

Figure 3 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) and language decisions (response) on RTs to neutral pseudo-words in Experiment 1 a)Interaction effect of diffOLD and maxdiffBF b) Interaction effect of diffOLD and response c) marginally significantinteraction effect of maxdiffBF and response

(Table 7) Random factor structure of the model includedintercepts for subjects and items and random slopesfor the interaction of marker language diffBF anddiffOLD Participants made 12 errors on average (English-marked 2ndash24 error trials German-marked 6ndash33 error

trials) which was not sufficient for an RT analysison error trials Thus RTs were analyzed for correctresponses only The results largely mirrored the patternof language decisions on marked PWs Responses toGerman PWs were slower than responses to English PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

588 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Figure 4 Visualization of interaction effects of differences in orthographic neighborhood size (diffOLD) and markerlanguage on the errors and RTs to marked pseudo-words in Experiments 1 and 2

(b = 477 see Table 7) Moreover there was a significant2-way interaction of marker language and diffOLD (b= minus254 Figure 4b) and a marginally significant 3-wayinteraction between diffOLD diffBF and marker language(b = 158) A separate model for English-marked PWsshowed no effects of the continuous variables for English-marked PWs (p gt 1) However the separate model forGerman-marked PWs showed that responses for German-marked PWs were slowed for larger English than Germanneighborhoods (ie decrease in diffOLD) b = minus189 SD= 65 t = minus280 χ2 (1) = 78 p = 005 Additionallythis effect was attenuated for more positive diffBF valuesb = 120 SD = 53 t = 227 χ2 (1) = 51 p = 02In other words differences in orthographic neighborhoodsize mattered less if the mean bigram frequency differencewas in line with marker language for German-markedPWs

Discussion

Our data replicate previous studies (Casaponsa et al2014 Thomas amp Allport 2000 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) that showed thatsublexical orthographic markers provide strong language

decision cues as orthographically marked pseudo-wordswere mostly categorized in line with their marker

Our data further extends previous findings in severalways First we show that continuous differences inlexical neighborhood size and maximal bigram frequencydifferences influence language decisions in unmarkedpseudo-words This was apparent in language attributionas well as in response latencies to neutral pseudo-words In particular response latencies increased whensublexical vs lexical levels provided conflicting languagemembership information suggesting that languagemembership information from both levels is integratedtowards a language decision Second we also find thatorthographic neighborhoods bias language decisions evenfor marked letter strings ndash as reflected by more marker-incongruent responses and slowed marker-congruentresponses for L1-marked PWs with larger orthographicneighborhoods in L2 than in L1 This effect was absentfor L2-marked PWs probably because our German-dominant participants could discard these pseudo-wordsas non-German based on their orthographic markers ndashrepresenting violations of L1 orthographic patterns ndashalone This also explains why correct categorizationsof English marked PWs were faster than respectivecorrect German categorizations (for a similar argument

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 7: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

584 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

speakers of German and English respectively Onlypseudo-words that were rated as pronounceable andorthographically legal in both languages by all refereeswere included in the study as neutral pseudo-wordsSimilarly marked pseudo-words were only included ifall referees rated them as pronounceable in the markinglanguage and illegal in the other language

ProcedureParticipants performed a speeded language decision taskin which they were required to categorize whether apseudo-word was more likely to be a German or anEnglish word Stimuli were presented on a 18primeprime computerscreen (Arial font size 40) using the Psychtoolbox(Brainard 1997 Kleiner Brainard Pelli Ingling Murrayamp Broussard 2007) for MATLAB (version 7100Mathworks Inc Natick Massachusetts) A trial consistedof a fixation cross presented for 800 ms followedby presentation of the stimulus until the participantresponded for at most 2 sec Responses were given bybutton press on a custom-made response box with the twoindex fingers Buttons were assigned to a language byGerman and British flags in the respective corners of thescreen below the stimulus Button assignment remainedconstant within but switched between blocks On eachtrial response latency (RT) and language decision wererecorded

The task began with instructions and an example trialfollowed by the 384 experimental trials presented in 8blocks of equal length1 Pseudo-randomization ensuredthat not more than three equally marked items wouldappear consecutively Participants could have self-pacedbreaks after each block If a participant failed to respondbefore the time-out for more than three times he or shewas reminded to respond faster Subsequent to completionof the language decision task participants completed theproficiency tests described above

Statistical analysisAll statistical analyses were performed in the opensource statistical programming environment R (R coreteam 2013) RTs were analysed with linear mixed-effects models and language decisions were analysedusing logistic mixed-effects models (Baayen Davidsonamp Bates 2008 Jaeger 2008) as implemented in thelme4 package for R (Bates Maechler Bolker amp Walker2014) and included subjects and items as random factorsDue to the complex random factor structure of our datathe implementation of maximal random factor structurewas not practicable Instead we modelled the randomfactor structure with the intercept and the maximal order

1 The first 2 blocks contained neutral pseudo-words only Block typedid not affect the variables of interest and hence results are reportedcollapsed across block types

interaction of fixed factors as recently suggested (Barr2013) Significance of fixed effects was assessed throughtype-III Wald-tests of single parameter estimates Wefollow the convention of the lme4 package by reportingthe z-test statistic for logistic mixed-effects models andthe χ2-test statistic for linear models (note though thatboth test statistics are equivalent)

Note that it is not possible to define correctand wrong responses for neutral pseudo-words sincemost of them contain conflicting language membershipinformation and no exclusive language cues Thuslanguage decision analyses for neutral pseudo-wordsmodelled the probability of English responses Howeverfor marked pseudo-words the marking creates a strongpreference for the marker language Thus for markedpseudo-words we conducted analyses on the marker-incongruent responses to which we refer as errorresponses

First to examine whether orthographic markers biasedlanguage decisions we analysed the effects of marker type(German (G) vs English (E) vs neutral (N))2 Then theeffects of continuous variables on language decision andresponse latencies were analysed separately for neutraland marked pseudo-words

Outlier exclusion was performed for each participantand separately for marked and neutral PWs based onseparate multiple regression models for each participantcontaining the same factors that were used for RTanalyses The expected RT for each trial was determinedand trials with a difference of more than 25 residualsrsquo SDbetween expected and actual RT were labelled as outliersand discarded from further analyses

Results

Data of all participants were included in the analysesOutlier analysis led to a removal of 25 of the trialsFor illustration purposes only continuous variables weresubdivided in three equally large subsets of which thetwo extreme ones are depicted in figures with the labelldquoGermanrdquo for the subset with most positive ie mostGerman-like values and the label ldquoEnglishrdquo for the mostnegative ie most English-like values

Effect of marker presence on language decisionsWe analyzed the effect of marker language (G vs E vs N)on the probability of English responses with a logisticmixed-effects model which included marker languageas a fixed factor intercepts for subject and items asrandom factors and random slopes for marker language

2 Due to the high number of marker-congruent responses for markedPWs there were not enough error trials (lt 15 for most participants)to analyze RTs across marked and neutral PWs including response asfactor Thus we analyze RTs separately for marked and neutral PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 585

Table 5 Mean and standard deviations (SD) for RTs to marked and neutral PW in Experiment 1 and 2

RT

Response German Response English Response English

Exp 1 mean SD mean SD mean SD

Markedness English 853 315 746 204 90 29

German 776 220 859 316 16 36

Neutral 871 279 905 292 45 50

Exp 2

mean SD mean SD mean SD

Markedness English 758 244 778 311 79 41

German 730 215 792 307 17 38

Neutral 712 211 768 318 42 49

within subject Marker language was contrast-coded inthe model with the two orthogonal contrasts English-marked and neutral versus German-marked and English-marked vs neutral PWs Marker language influenced theattribution of pseudo-words towards that language (E90 G 17 N 45 response English see Table 5)χ2 (1) = 4629 p lt 001 Planned comparisons whichwere directly encoded in the model showed that English-marked and neutral PWs were more often classified asEnglish than German-marked PWs b = 208 SD = 011z = 187 p lt 001 and that the same held for English-marked as compared to neutral pseudo-words b = 14SD = 008 z = 180 p lt 001

Effects of continuous variables in the absence ofmarkers Language decisionThe analysis of language decisions on neutral PWsinvolved the continuous variables diffBF maxdiffBF anddiffOLD as well as all possible interactions between themas fixed effects Random factor structure of the modelincluded intercepts for subjects and items as well asrandom slopes for the interaction of diffBF maxdiffBFand diffOLD As expected neutral pseudo-words weremore often categorized as similar to the language in whichtheir orthographic neighborhood was larger b = minus028SD = 007 z = 39 p lt 001 (Figure 2) No significanteffects were obtained for maxdiffBF and diffBF or for anyinteractions

Effects of continuous variables in the absence ofmarkers Response timesThe analysis of response times to neutral PWs (Table 6)contained the fixed factors response language (G vsE dummy coded) and the continuous variables diffBFmaxdiffBF and diffOLD Random factor structure of themodel included intercepts for subjects and items randomslopes for the interaction of response diffBF maxdiffBFand diffOLD within subjects and random slopes for

response within items Responses ldquoGermanrdquo were overallfaster than responses ldquoEnglishrdquo (see Table 5 for meanRTs) While there was no main effect for diffOLD itsinteraction with response was significant (Figure 3b)increases in diffOLD (ie increasingly larger Germancompared to English neighborhood) resulted in fasterreaction times (b =minus354) for ldquoGermanrdquo responses whileslowing ldquoEnglishrdquo responses (b = ndash354 + 2251 19)A similar marginally significant pattern was foundfor maxdiffBF (Figure 3c b response German minus12b response English 14) ldquoGermanrdquo responses were faster ifmaxdiffBF was positive (ie German-typical) whereasldquoEnglishrdquo responses were faster for negative maxdiffBFvalues (ie English-typical) The significant interactionof diffOLD and maxdiffBF (Figure 3a) resulted fromfaster responses for items with consistent diffOLDand maxdiffBF values and slower responses wheneither positive diffOLD was accompanied by negativemaxdiffBF values or vice versa In other words responseswere fast if both variables supported the choice of the samelanguage and slowed if they pointed to different languages(eg when a PW had more orthographic neighbors inEnglish than in German but a more German-typicalmaxdiffBF value) supporting the importance of bothvariables for the decision process All other main effectsand interactions were not significant

Interactions of markers and continuous variables3Language decisionAnalysis of errors for marked PWs contained thefixed factors marker language (G vs E) and thecontinuous variables diffBF and diffOLD as well asall of their interactions Random factor structure of

3 In both analyses marker language was dummy-coded with Englishas baseline such that main effects of continuous variables reflectedtheir effect for English-marked PWs and their interaction with markerlanguage the addition in beta for German-marked PWs as comparedto English-marked PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

586 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 6 Summary of linear mixed-effect regression for reaction times to neutral PW in Experiment 1 Orthographicneighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency difference diffBF = BFG ndash BFE maximalbigram frequency difference maxdiffBF is the maximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ 2 p-value Significance

Intercept 87774 3248 2702 7303 lt 001 lowastlowastdiffOLD minus354 600 minus059 035 56

diffBF minus1673 899 minus186 346 06 +

maxdiffBF minus1158 957 minus121 147 23

Response1 4749 893 532 2827 lt 001 lowastlowastdiffOLD diffBF 1259 931 135 183 18

diffOLD maxdiffBF minus1825 890 minus205 421 04 lowastdiffBF maxdiffBF minus485 1028 minus047 022 64

diffOLD Response 2251 882 255 652 01 lowastdiffBF Response 165 1213 014 002 89

maxdiffBF Response 2554 1391 184 337 07 +

diffOLD diffBF maxdiffBF 1189 1041 114 131 25

diffOLD diffBF Response minus1524 1309 minus116 135 24

diffOLD maxdiffBF Response 962 1300 074 055 46

diffBF maxdiffBF Response minus2260 1409 minus16 257 11

diffOLD diffBF maxdiffBF Response minus1478 1450 minus102 104 31

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

Experiment 1 Experiment 2

30

40

50

60

70

German English German EnglishdiffOLD

R

espo

nse

Eng

lish

maxdiffBF

German

English

Figure 2 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) and maximal bigramfrequency difference (maxdiffBF) on language decisions for neutral pseudo-words in Experiments 1 and 2

the model included intercepts for subjects and itemsand random slopes for the interaction of responseand diffOLD within subjects Overall above 80 ofresponses to marked pseudo-words were in line with theirmarker although more marker-incongruent responseswere made for German-marked than for English-markedPWs b = 06 SD = 022 z = 28 p = 006 (G 17E 10 incongruent responses see Table 5) Cruciallythe effect of marker language interacted with diffOLDb = minus06 SD = 02 z = ndash33 p lt001 (Figure 4a)Planned comparisons showed an increase in marker-

incongruent responses for German-marked PWs with adecrease in diffOLD (ie more English neighbors thanGerman neighbors) b = minus04 SD = 01 z = minus37 plt 001 whereas diffOLD had no effect on responses forEnglish-marked PWs (p gt 1)

Interactions of markers and continuous variablesResponse times

Analysis of RTs to marked PWs involved the fixedfactors marker language (G vs E) diffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 587

Table 7 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 1 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

(Intercept) 74556 2469 3020 91195 lt 001 lowastlowastdiffOLD 736 694 106 113 29

diffBF minus1226 842 minus146 212 15

Marker Language1 4767 1393 342 1170 lt 001 lowastlowastdiffOLD diffBF minus323 641 minus050 025 61

diffOLD Marker Language minus2539 965 minus263 693 008 lowastdiffBF Marker Language 379 1093 035 012 73

diffOLD diffBF Marker Language 1580 850 186 346 06 +

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Marker language was dummy-coded with English as baseline

800

850

900

950

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p = 04

800

850

900

950

German EnglishResponse

RT

[mse

c]b) diffOLD x Response p = 01

800

850

900

950

German EnglishResponse

RT

[mse

c]

c) maxdiffBF x Response p = 07

maxdiffBF German English

diffOLD German English

Figure 3 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) and language decisions (response) on RTs to neutral pseudo-words in Experiment 1 a)Interaction effect of diffOLD and maxdiffBF b) Interaction effect of diffOLD and response c) marginally significantinteraction effect of maxdiffBF and response

(Table 7) Random factor structure of the model includedintercepts for subjects and items and random slopesfor the interaction of marker language diffBF anddiffOLD Participants made 12 errors on average (English-marked 2ndash24 error trials German-marked 6ndash33 error

trials) which was not sufficient for an RT analysison error trials Thus RTs were analyzed for correctresponses only The results largely mirrored the patternof language decisions on marked PWs Responses toGerman PWs were slower than responses to English PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

588 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Figure 4 Visualization of interaction effects of differences in orthographic neighborhood size (diffOLD) and markerlanguage on the errors and RTs to marked pseudo-words in Experiments 1 and 2

(b = 477 see Table 7) Moreover there was a significant2-way interaction of marker language and diffOLD (b= minus254 Figure 4b) and a marginally significant 3-wayinteraction between diffOLD diffBF and marker language(b = 158) A separate model for English-marked PWsshowed no effects of the continuous variables for English-marked PWs (p gt 1) However the separate model forGerman-marked PWs showed that responses for German-marked PWs were slowed for larger English than Germanneighborhoods (ie decrease in diffOLD) b = minus189 SD= 65 t = minus280 χ2 (1) = 78 p = 005 Additionallythis effect was attenuated for more positive diffBF valuesb = 120 SD = 53 t = 227 χ2 (1) = 51 p = 02In other words differences in orthographic neighborhoodsize mattered less if the mean bigram frequency differencewas in line with marker language for German-markedPWs

Discussion

Our data replicate previous studies (Casaponsa et al2014 Thomas amp Allport 2000 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) that showed thatsublexical orthographic markers provide strong language

decision cues as orthographically marked pseudo-wordswere mostly categorized in line with their marker

Our data further extends previous findings in severalways First we show that continuous differences inlexical neighborhood size and maximal bigram frequencydifferences influence language decisions in unmarkedpseudo-words This was apparent in language attributionas well as in response latencies to neutral pseudo-words In particular response latencies increased whensublexical vs lexical levels provided conflicting languagemembership information suggesting that languagemembership information from both levels is integratedtowards a language decision Second we also find thatorthographic neighborhoods bias language decisions evenfor marked letter strings ndash as reflected by more marker-incongruent responses and slowed marker-congruentresponses for L1-marked PWs with larger orthographicneighborhoods in L2 than in L1 This effect was absentfor L2-marked PWs probably because our German-dominant participants could discard these pseudo-wordsas non-German based on their orthographic markers ndashrepresenting violations of L1 orthographic patterns ndashalone This also explains why correct categorizationsof English marked PWs were faster than respectivecorrect German categorizations (for a similar argument

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 8: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

Sublexical and lexical effects on language membership decision 585

Table 5 Mean and standard deviations (SD) for RTs to marked and neutral PW in Experiment 1 and 2

RT

Response German Response English Response English

Exp 1 mean SD mean SD mean SD

Markedness English 853 315 746 204 90 29

German 776 220 859 316 16 36

Neutral 871 279 905 292 45 50

Exp 2

mean SD mean SD mean SD

Markedness English 758 244 778 311 79 41

German 730 215 792 307 17 38

Neutral 712 211 768 318 42 49

within subject Marker language was contrast-coded inthe model with the two orthogonal contrasts English-marked and neutral versus German-marked and English-marked vs neutral PWs Marker language influenced theattribution of pseudo-words towards that language (E90 G 17 N 45 response English see Table 5)χ2 (1) = 4629 p lt 001 Planned comparisons whichwere directly encoded in the model showed that English-marked and neutral PWs were more often classified asEnglish than German-marked PWs b = 208 SD = 011z = 187 p lt 001 and that the same held for English-marked as compared to neutral pseudo-words b = 14SD = 008 z = 180 p lt 001

Effects of continuous variables in the absence ofmarkers Language decisionThe analysis of language decisions on neutral PWsinvolved the continuous variables diffBF maxdiffBF anddiffOLD as well as all possible interactions between themas fixed effects Random factor structure of the modelincluded intercepts for subjects and items as well asrandom slopes for the interaction of diffBF maxdiffBFand diffOLD As expected neutral pseudo-words weremore often categorized as similar to the language in whichtheir orthographic neighborhood was larger b = minus028SD = 007 z = 39 p lt 001 (Figure 2) No significanteffects were obtained for maxdiffBF and diffBF or for anyinteractions

Effects of continuous variables in the absence ofmarkers Response timesThe analysis of response times to neutral PWs (Table 6)contained the fixed factors response language (G vsE dummy coded) and the continuous variables diffBFmaxdiffBF and diffOLD Random factor structure of themodel included intercepts for subjects and items randomslopes for the interaction of response diffBF maxdiffBFand diffOLD within subjects and random slopes for

response within items Responses ldquoGermanrdquo were overallfaster than responses ldquoEnglishrdquo (see Table 5 for meanRTs) While there was no main effect for diffOLD itsinteraction with response was significant (Figure 3b)increases in diffOLD (ie increasingly larger Germancompared to English neighborhood) resulted in fasterreaction times (b =minus354) for ldquoGermanrdquo responses whileslowing ldquoEnglishrdquo responses (b = ndash354 + 2251 19)A similar marginally significant pattern was foundfor maxdiffBF (Figure 3c b response German minus12b response English 14) ldquoGermanrdquo responses were faster ifmaxdiffBF was positive (ie German-typical) whereasldquoEnglishrdquo responses were faster for negative maxdiffBFvalues (ie English-typical) The significant interactionof diffOLD and maxdiffBF (Figure 3a) resulted fromfaster responses for items with consistent diffOLDand maxdiffBF values and slower responses wheneither positive diffOLD was accompanied by negativemaxdiffBF values or vice versa In other words responseswere fast if both variables supported the choice of the samelanguage and slowed if they pointed to different languages(eg when a PW had more orthographic neighbors inEnglish than in German but a more German-typicalmaxdiffBF value) supporting the importance of bothvariables for the decision process All other main effectsand interactions were not significant

Interactions of markers and continuous variables3Language decisionAnalysis of errors for marked PWs contained thefixed factors marker language (G vs E) and thecontinuous variables diffBF and diffOLD as well asall of their interactions Random factor structure of

3 In both analyses marker language was dummy-coded with Englishas baseline such that main effects of continuous variables reflectedtheir effect for English-marked PWs and their interaction with markerlanguage the addition in beta for German-marked PWs as comparedto English-marked PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

586 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 6 Summary of linear mixed-effect regression for reaction times to neutral PW in Experiment 1 Orthographicneighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency difference diffBF = BFG ndash BFE maximalbigram frequency difference maxdiffBF is the maximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ 2 p-value Significance

Intercept 87774 3248 2702 7303 lt 001 lowastlowastdiffOLD minus354 600 minus059 035 56

diffBF minus1673 899 minus186 346 06 +

maxdiffBF minus1158 957 minus121 147 23

Response1 4749 893 532 2827 lt 001 lowastlowastdiffOLD diffBF 1259 931 135 183 18

diffOLD maxdiffBF minus1825 890 minus205 421 04 lowastdiffBF maxdiffBF minus485 1028 minus047 022 64

diffOLD Response 2251 882 255 652 01 lowastdiffBF Response 165 1213 014 002 89

maxdiffBF Response 2554 1391 184 337 07 +

diffOLD diffBF maxdiffBF 1189 1041 114 131 25

diffOLD diffBF Response minus1524 1309 minus116 135 24

diffOLD maxdiffBF Response 962 1300 074 055 46

diffBF maxdiffBF Response minus2260 1409 minus16 257 11

diffOLD diffBF maxdiffBF Response minus1478 1450 minus102 104 31

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

Experiment 1 Experiment 2

30

40

50

60

70

German English German EnglishdiffOLD

R

espo

nse

Eng

lish

maxdiffBF

German

English

Figure 2 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) and maximal bigramfrequency difference (maxdiffBF) on language decisions for neutral pseudo-words in Experiments 1 and 2

the model included intercepts for subjects and itemsand random slopes for the interaction of responseand diffOLD within subjects Overall above 80 ofresponses to marked pseudo-words were in line with theirmarker although more marker-incongruent responseswere made for German-marked than for English-markedPWs b = 06 SD = 022 z = 28 p = 006 (G 17E 10 incongruent responses see Table 5) Cruciallythe effect of marker language interacted with diffOLDb = minus06 SD = 02 z = ndash33 p lt001 (Figure 4a)Planned comparisons showed an increase in marker-

incongruent responses for German-marked PWs with adecrease in diffOLD (ie more English neighbors thanGerman neighbors) b = minus04 SD = 01 z = minus37 plt 001 whereas diffOLD had no effect on responses forEnglish-marked PWs (p gt 1)

Interactions of markers and continuous variablesResponse times

Analysis of RTs to marked PWs involved the fixedfactors marker language (G vs E) diffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 587

Table 7 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 1 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

(Intercept) 74556 2469 3020 91195 lt 001 lowastlowastdiffOLD 736 694 106 113 29

diffBF minus1226 842 minus146 212 15

Marker Language1 4767 1393 342 1170 lt 001 lowastlowastdiffOLD diffBF minus323 641 minus050 025 61

diffOLD Marker Language minus2539 965 minus263 693 008 lowastdiffBF Marker Language 379 1093 035 012 73

diffOLD diffBF Marker Language 1580 850 186 346 06 +

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Marker language was dummy-coded with English as baseline

800

850

900

950

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p = 04

800

850

900

950

German EnglishResponse

RT

[mse

c]b) diffOLD x Response p = 01

800

850

900

950

German EnglishResponse

RT

[mse

c]

c) maxdiffBF x Response p = 07

maxdiffBF German English

diffOLD German English

Figure 3 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) and language decisions (response) on RTs to neutral pseudo-words in Experiment 1 a)Interaction effect of diffOLD and maxdiffBF b) Interaction effect of diffOLD and response c) marginally significantinteraction effect of maxdiffBF and response

(Table 7) Random factor structure of the model includedintercepts for subjects and items and random slopesfor the interaction of marker language diffBF anddiffOLD Participants made 12 errors on average (English-marked 2ndash24 error trials German-marked 6ndash33 error

trials) which was not sufficient for an RT analysison error trials Thus RTs were analyzed for correctresponses only The results largely mirrored the patternof language decisions on marked PWs Responses toGerman PWs were slower than responses to English PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

588 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Figure 4 Visualization of interaction effects of differences in orthographic neighborhood size (diffOLD) and markerlanguage on the errors and RTs to marked pseudo-words in Experiments 1 and 2

(b = 477 see Table 7) Moreover there was a significant2-way interaction of marker language and diffOLD (b= minus254 Figure 4b) and a marginally significant 3-wayinteraction between diffOLD diffBF and marker language(b = 158) A separate model for English-marked PWsshowed no effects of the continuous variables for English-marked PWs (p gt 1) However the separate model forGerman-marked PWs showed that responses for German-marked PWs were slowed for larger English than Germanneighborhoods (ie decrease in diffOLD) b = minus189 SD= 65 t = minus280 χ2 (1) = 78 p = 005 Additionallythis effect was attenuated for more positive diffBF valuesb = 120 SD = 53 t = 227 χ2 (1) = 51 p = 02In other words differences in orthographic neighborhoodsize mattered less if the mean bigram frequency differencewas in line with marker language for German-markedPWs

Discussion

Our data replicate previous studies (Casaponsa et al2014 Thomas amp Allport 2000 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) that showed thatsublexical orthographic markers provide strong language

decision cues as orthographically marked pseudo-wordswere mostly categorized in line with their marker

Our data further extends previous findings in severalways First we show that continuous differences inlexical neighborhood size and maximal bigram frequencydifferences influence language decisions in unmarkedpseudo-words This was apparent in language attributionas well as in response latencies to neutral pseudo-words In particular response latencies increased whensublexical vs lexical levels provided conflicting languagemembership information suggesting that languagemembership information from both levels is integratedtowards a language decision Second we also find thatorthographic neighborhoods bias language decisions evenfor marked letter strings ndash as reflected by more marker-incongruent responses and slowed marker-congruentresponses for L1-marked PWs with larger orthographicneighborhoods in L2 than in L1 This effect was absentfor L2-marked PWs probably because our German-dominant participants could discard these pseudo-wordsas non-German based on their orthographic markers ndashrepresenting violations of L1 orthographic patterns ndashalone This also explains why correct categorizationsof English marked PWs were faster than respectivecorrect German categorizations (for a similar argument

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 9: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

586 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 6 Summary of linear mixed-effect regression for reaction times to neutral PW in Experiment 1 Orthographicneighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency difference diffBF = BFG ndash BFE maximalbigram frequency difference maxdiffBF is the maximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ 2 p-value Significance

Intercept 87774 3248 2702 7303 lt 001 lowastlowastdiffOLD minus354 600 minus059 035 56

diffBF minus1673 899 minus186 346 06 +

maxdiffBF minus1158 957 minus121 147 23

Response1 4749 893 532 2827 lt 001 lowastlowastdiffOLD diffBF 1259 931 135 183 18

diffOLD maxdiffBF minus1825 890 minus205 421 04 lowastdiffBF maxdiffBF minus485 1028 minus047 022 64

diffOLD Response 2251 882 255 652 01 lowastdiffBF Response 165 1213 014 002 89

maxdiffBF Response 2554 1391 184 337 07 +

diffOLD diffBF maxdiffBF 1189 1041 114 131 25

diffOLD diffBF Response minus1524 1309 minus116 135 24

diffOLD maxdiffBF Response 962 1300 074 055 46

diffBF maxdiffBF Response minus2260 1409 minus16 257 11

diffOLD diffBF maxdiffBF Response minus1478 1450 minus102 104 31

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

Experiment 1 Experiment 2

30

40

50

60

70

German English German EnglishdiffOLD

R

espo

nse

Eng

lish

maxdiffBF

German

English

Figure 2 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) and maximal bigramfrequency difference (maxdiffBF) on language decisions for neutral pseudo-words in Experiments 1 and 2

the model included intercepts for subjects and itemsand random slopes for the interaction of responseand diffOLD within subjects Overall above 80 ofresponses to marked pseudo-words were in line with theirmarker although more marker-incongruent responseswere made for German-marked than for English-markedPWs b = 06 SD = 022 z = 28 p = 006 (G 17E 10 incongruent responses see Table 5) Cruciallythe effect of marker language interacted with diffOLDb = minus06 SD = 02 z = ndash33 p lt001 (Figure 4a)Planned comparisons showed an increase in marker-

incongruent responses for German-marked PWs with adecrease in diffOLD (ie more English neighbors thanGerman neighbors) b = minus04 SD = 01 z = minus37 plt 001 whereas diffOLD had no effect on responses forEnglish-marked PWs (p gt 1)

Interactions of markers and continuous variablesResponse times

Analysis of RTs to marked PWs involved the fixedfactors marker language (G vs E) diffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 587

Table 7 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 1 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

(Intercept) 74556 2469 3020 91195 lt 001 lowastlowastdiffOLD 736 694 106 113 29

diffBF minus1226 842 minus146 212 15

Marker Language1 4767 1393 342 1170 lt 001 lowastlowastdiffOLD diffBF minus323 641 minus050 025 61

diffOLD Marker Language minus2539 965 minus263 693 008 lowastdiffBF Marker Language 379 1093 035 012 73

diffOLD diffBF Marker Language 1580 850 186 346 06 +

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Marker language was dummy-coded with English as baseline

800

850

900

950

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p = 04

800

850

900

950

German EnglishResponse

RT

[mse

c]b) diffOLD x Response p = 01

800

850

900

950

German EnglishResponse

RT

[mse

c]

c) maxdiffBF x Response p = 07

maxdiffBF German English

diffOLD German English

Figure 3 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) and language decisions (response) on RTs to neutral pseudo-words in Experiment 1 a)Interaction effect of diffOLD and maxdiffBF b) Interaction effect of diffOLD and response c) marginally significantinteraction effect of maxdiffBF and response

(Table 7) Random factor structure of the model includedintercepts for subjects and items and random slopesfor the interaction of marker language diffBF anddiffOLD Participants made 12 errors on average (English-marked 2ndash24 error trials German-marked 6ndash33 error

trials) which was not sufficient for an RT analysison error trials Thus RTs were analyzed for correctresponses only The results largely mirrored the patternof language decisions on marked PWs Responses toGerman PWs were slower than responses to English PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

588 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Figure 4 Visualization of interaction effects of differences in orthographic neighborhood size (diffOLD) and markerlanguage on the errors and RTs to marked pseudo-words in Experiments 1 and 2

(b = 477 see Table 7) Moreover there was a significant2-way interaction of marker language and diffOLD (b= minus254 Figure 4b) and a marginally significant 3-wayinteraction between diffOLD diffBF and marker language(b = 158) A separate model for English-marked PWsshowed no effects of the continuous variables for English-marked PWs (p gt 1) However the separate model forGerman-marked PWs showed that responses for German-marked PWs were slowed for larger English than Germanneighborhoods (ie decrease in diffOLD) b = minus189 SD= 65 t = minus280 χ2 (1) = 78 p = 005 Additionallythis effect was attenuated for more positive diffBF valuesb = 120 SD = 53 t = 227 χ2 (1) = 51 p = 02In other words differences in orthographic neighborhoodsize mattered less if the mean bigram frequency differencewas in line with marker language for German-markedPWs

Discussion

Our data replicate previous studies (Casaponsa et al2014 Thomas amp Allport 2000 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) that showed thatsublexical orthographic markers provide strong language

decision cues as orthographically marked pseudo-wordswere mostly categorized in line with their marker

Our data further extends previous findings in severalways First we show that continuous differences inlexical neighborhood size and maximal bigram frequencydifferences influence language decisions in unmarkedpseudo-words This was apparent in language attributionas well as in response latencies to neutral pseudo-words In particular response latencies increased whensublexical vs lexical levels provided conflicting languagemembership information suggesting that languagemembership information from both levels is integratedtowards a language decision Second we also find thatorthographic neighborhoods bias language decisions evenfor marked letter strings ndash as reflected by more marker-incongruent responses and slowed marker-congruentresponses for L1-marked PWs with larger orthographicneighborhoods in L2 than in L1 This effect was absentfor L2-marked PWs probably because our German-dominant participants could discard these pseudo-wordsas non-German based on their orthographic markers ndashrepresenting violations of L1 orthographic patterns ndashalone This also explains why correct categorizationsof English marked PWs were faster than respectivecorrect German categorizations (for a similar argument

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 10: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

Sublexical and lexical effects on language membership decision 587

Table 7 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 1 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

(Intercept) 74556 2469 3020 91195 lt 001 lowastlowastdiffOLD 736 694 106 113 29

diffBF minus1226 842 minus146 212 15

Marker Language1 4767 1393 342 1170 lt 001 lowastlowastdiffOLD diffBF minus323 641 minus050 025 61

diffOLD Marker Language minus2539 965 minus263 693 008 lowastdiffBF Marker Language 379 1093 035 012 73

diffOLD diffBF Marker Language 1580 850 186 346 06 +

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 0011 Marker language was dummy-coded with English as baseline

800

850

900

950

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p = 04

800

850

900

950

German EnglishResponse

RT

[mse

c]b) diffOLD x Response p = 01

800

850

900

950

German EnglishResponse

RT

[mse

c]

c) maxdiffBF x Response p = 07

maxdiffBF German English

diffOLD German English

Figure 3 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) and language decisions (response) on RTs to neutral pseudo-words in Experiment 1 a)Interaction effect of diffOLD and maxdiffBF b) Interaction effect of diffOLD and response c) marginally significantinteraction effect of maxdiffBF and response

(Table 7) Random factor structure of the model includedintercepts for subjects and items and random slopesfor the interaction of marker language diffBF anddiffOLD Participants made 12 errors on average (English-marked 2ndash24 error trials German-marked 6ndash33 error

trials) which was not sufficient for an RT analysison error trials Thus RTs were analyzed for correctresponses only The results largely mirrored the patternof language decisions on marked PWs Responses toGerman PWs were slower than responses to English PWs

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

588 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Figure 4 Visualization of interaction effects of differences in orthographic neighborhood size (diffOLD) and markerlanguage on the errors and RTs to marked pseudo-words in Experiments 1 and 2

(b = 477 see Table 7) Moreover there was a significant2-way interaction of marker language and diffOLD (b= minus254 Figure 4b) and a marginally significant 3-wayinteraction between diffOLD diffBF and marker language(b = 158) A separate model for English-marked PWsshowed no effects of the continuous variables for English-marked PWs (p gt 1) However the separate model forGerman-marked PWs showed that responses for German-marked PWs were slowed for larger English than Germanneighborhoods (ie decrease in diffOLD) b = minus189 SD= 65 t = minus280 χ2 (1) = 78 p = 005 Additionallythis effect was attenuated for more positive diffBF valuesb = 120 SD = 53 t = 227 χ2 (1) = 51 p = 02In other words differences in orthographic neighborhoodsize mattered less if the mean bigram frequency differencewas in line with marker language for German-markedPWs

Discussion

Our data replicate previous studies (Casaponsa et al2014 Thomas amp Allport 2000 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) that showed thatsublexical orthographic markers provide strong language

decision cues as orthographically marked pseudo-wordswere mostly categorized in line with their marker

Our data further extends previous findings in severalways First we show that continuous differences inlexical neighborhood size and maximal bigram frequencydifferences influence language decisions in unmarkedpseudo-words This was apparent in language attributionas well as in response latencies to neutral pseudo-words In particular response latencies increased whensublexical vs lexical levels provided conflicting languagemembership information suggesting that languagemembership information from both levels is integratedtowards a language decision Second we also find thatorthographic neighborhoods bias language decisions evenfor marked letter strings ndash as reflected by more marker-incongruent responses and slowed marker-congruentresponses for L1-marked PWs with larger orthographicneighborhoods in L2 than in L1 This effect was absentfor L2-marked PWs probably because our German-dominant participants could discard these pseudo-wordsas non-German based on their orthographic markers ndashrepresenting violations of L1 orthographic patterns ndashalone This also explains why correct categorizationsof English marked PWs were faster than respectivecorrect German categorizations (for a similar argument

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 11: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

588 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Figure 4 Visualization of interaction effects of differences in orthographic neighborhood size (diffOLD) and markerlanguage on the errors and RTs to marked pseudo-words in Experiments 1 and 2

(b = 477 see Table 7) Moreover there was a significant2-way interaction of marker language and diffOLD (b= minus254 Figure 4b) and a marginally significant 3-wayinteraction between diffOLD diffBF and marker language(b = 158) A separate model for English-marked PWsshowed no effects of the continuous variables for English-marked PWs (p gt 1) However the separate model forGerman-marked PWs showed that responses for German-marked PWs were slowed for larger English than Germanneighborhoods (ie decrease in diffOLD) b = minus189 SD= 65 t = minus280 χ2 (1) = 78 p = 005 Additionallythis effect was attenuated for more positive diffBF valuesb = 120 SD = 53 t = 227 χ2 (1) = 51 p = 02In other words differences in orthographic neighborhoodsize mattered less if the mean bigram frequency differencewas in line with marker language for German-markedPWs

Discussion

Our data replicate previous studies (Casaponsa et al2014 Thomas amp Allport 2000 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) that showed thatsublexical orthographic markers provide strong language

decision cues as orthographically marked pseudo-wordswere mostly categorized in line with their marker

Our data further extends previous findings in severalways First we show that continuous differences inlexical neighborhood size and maximal bigram frequencydifferences influence language decisions in unmarkedpseudo-words This was apparent in language attributionas well as in response latencies to neutral pseudo-words In particular response latencies increased whensublexical vs lexical levels provided conflicting languagemembership information suggesting that languagemembership information from both levels is integratedtowards a language decision Second we also find thatorthographic neighborhoods bias language decisions evenfor marked letter strings ndash as reflected by more marker-incongruent responses and slowed marker-congruentresponses for L1-marked PWs with larger orthographicneighborhoods in L2 than in L1 This effect was absentfor L2-marked PWs probably because our German-dominant participants could discard these pseudo-wordsas non-German based on their orthographic markers ndashrepresenting violations of L1 orthographic patterns ndashalone This also explains why correct categorizationsof English marked PWs were faster than respectivecorrect German categorizations (for a similar argument

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 12: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

Sublexical and lexical effects on language membership decision 589

see Casaponsa et al 2014 as well as Vaid amp Frenck-Mestre 2002)

Next we asked whether the effects of continuousdifferences in language similarity would be preserved in amore natural task where language decisions are necessarybut are not made explicitly To achieve this we applied thedesign of Experiment 1 to a naming task during whichlanguage categorization happens by choice of language-specific articulation (which differed between languages inour stimuli see Table S1 in supplementary online materialSupplementary Material)

Experiment 2 Naming Task

Methods amp materials

ParticipantsTwenty-four (6 male aged 18 ndash 29 mean age 24) lateGermanndashEnglish bilinguals from the same populationas the participants of Experiment 1 participated inExperiment 2 (Table 2) None of the participants ofExperiment 2 participated in Experiment 1

ProcedureIn the naming task participants were required to name(read out) pseudo-words They were told that although thePWs were unknown to them they could be read accordingto the pronunciation rules of either German or EnglishParticipants were instructed to name the pseudo-wordsas fast as possible Importantly they were instructed notto choose a pronunciation language prior to naming butrather to read them out spontaneously and intuitively Analgorithm based on zero-crossings and power estimationwas employed to register response onsets online Basedon the results of this algorithm the end of a trial wasdetermined and participants were provided with visualfeedback indicating that their response was registeredResponses were recorded with a headset microphone(Sennheiser PC 131 headset)

Results

The recordings were analyzed offline to determine naminglatencies and response language by an independenthighly-proficient GermanndashEnglish bilingual referee withthe CheckVocal software (v 222 Protopapas 2007)Another referee reanalyzed a randomly chosen subsetconsisting of 50 neutral and 50 marked pseudo-wordsfor each participant Across referee reliability wasabove 95 for naming latencies and above 85 forresponse language Stimuli design and presentationdetails were identical to Experiment 1 Naming latencies(RT) and response language were analyzed with the sameprocedures as in Experiment 1 In particular we usedthe same mixed-effects modeling approach (including

random and fixed effect structures) as in Experiment 1Outlier analyses lead to the exclusion of 35 of all trials

Effect of marker presence on response languageAs in Experiment 1 marker type (G vs E vs N) influencedthe attribution of PWs towards a language (E 79G 17 N 42 English pronunciations Table 5) χ2

(2) = 24578 p lt 001 Planned comparisons showedthat English-marked and neutral PWs were more oftenpronounced as English than German pseudo-words b =18 SD = 014 z = 121 p lt 001 and that English-marked pseudo-words were more often pronounced asEnglish than neutral pseudo-words b = 11 SD = 008z = 127 p lt 001

Effect of continuous variables in the absence ofmarkersData of three participants who named less than 10 neutralPWs in English were excluded from this analysis thusnaming of neutral PWs was analyzed based on data from21 participants

Effect of continuous variables in the absence ofmarkers Response languageSimilar to the results of Experiment 1 the probabilityfor the choice of English pronunciations increased withthe difference between English and German orthographicneighbourhood sizes b = minus025 SD = 008 z = ndash31p = 002 Additionally and different from Experiment 1the probability for the choice of German pronunciationsincreased with the German-typicality of maxdiffBFb = minus023 SD = 01 z = ndash19 p = 06 seeFigure 2 All other main effects and interactions were notsignificant

Effect of continuous variables in the absence ofmarkers Naming latenciesThe linear mixed-effects model for naming latencies toneutral pseudo-words is summarized in Table 8 Namingwas faster for PWs with more positive diffBF iehigh German-typicality at the sublexical level (Figure 5b)independently of response language (betaGerman minus24betaEnglish minus33) Naming latencies in both languageswere marginally slower when diffOLD and maxdiffBFprovided conflicting language information (Figure 5a)To elucidate the 3-way interaction of diffBF maxdiffBFand response (Figure 5c) we conducted LME modelsfor each response language separately There was nointeraction of diffBF and maxdiffBF for ldquoGermanrdquoresponses suggesting that the 3-way interaction in theomnibus LME model was driven by trials with ldquoEnglishrdquoresponses Indeed for ldquoEnglishrdquo responses the interactionof diffBF and maxdiffBF b = 344 SD = 153 t =227 χ2 (1) = 52 p = 02 reflected that the generalspeed-up of responses with increases in diffBF (ie

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 13: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

590 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Table 8 Summary of linear mixed-effects regression for reaction times to neutral PW in Experiment 2Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigram frequency differencediffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is the maximal bigramfrequency difference across all bigrams of a word

Predictor beta-estimate SD t-value χ2 p-value Significance

Intercept 74639 3327 2244 5033 lt 001 lowastlowastdiffOLD minus160 885 minus018 003 86

diffBF minus2407 1099 minus219 480 03 lowastmaxdiffBF 1902 1401 136 184 17

Response 600 579 104 107 30

diffOLD diffBF minus946 1151 minus082 068 41

diffOLD maxdiffBF 2368 1298 182 333 07 +

diffBF maxdiffBF 1908 1112 172 294 09 +

diffOLD Response minus217 559 minus039 015 70

diffBF Response minus861 690 minus125 156 21

maxdiffBF Response 1011 890 114 129 26

diffOLD diffBF maxdiffBF 1249 1299 096 093 34

diffOLD diffBF Response minus1252 727 minus172 296 09 +

diffOLD maxdiffBF Response 1361 825 165 272 10

diffBF maxdiffBF Response 1726 705 245 600 01 lowastdiffOLD diffBF maxdiffBF Response 1598 992 161 259 11

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001Responses are dummy-coded as 0 ndash German and 1 ndash English such that Response German is the reference label

more German-typical values) was reduced for positivemaxdiffBF (ie German-typical) values ndash presumablybecause particularly strong ldquoGermanrdquo evidence at onebigram position reduced the effect of the remainingbigrams

Interactions of markers and continuous variablesNaming language

Overall marking provided a very strong cue to thenaming language resulting in on average 80 of marker-congruent pronunciations (Table 5) The interaction ofdiffOLD and marker language b = 84 SD = 02z = 39 p lt001 was due to differential effects of diffOLDon errors for German-marked and English-marked PWs(Figure 4c) For German marked PWs increasingly moreEnglish than German neighbors lead to an increase inthe number of marker-incongruent responses b = ndash 06SD = 015 z = minus40 p lt 001 whereas there was nosuch effect for English-marked PWs (p gt 1) There wereno other main effects or interactions

Interactions of markers and continuous variablesNaming latencies

Results of the analysis of RTs to marked PWs aresummarized in Table 9 The main effect of diffBF was

marginally significant (b = minus2687) suggesting thatnaming was faster if mean bigrams frequencies weremore positive (ie more German-typical) similar to theeffect for neutral PWs While there were no main effectsof diffOLD and marker language their interaction wassignificant (b = ndash299) ndash as expected from Experiment1 ndash involving opposite effects of diffOLD for German-vs English-marked pseudo-words Namely naming ofGerman-marked PWs got faster with more German thanEnglish neighbors whereas naming of English-markedPWs tended to be slowed accordingly When testedseparately within each marker type the effect of diffOLDwas marginally significant for German-marked PWs b =minus133 SD = 76 t = 174 χ2 (1) = 304 p = 08 but notsignificant for English-marked PWs the latter probablydue to larger variance in participantsrsquo naming latencies toEnglish-marked PWs In summary the pattern of effectson naming latencies for correctly named marked PWswas similar to the pattern for naming language choicesas well as the RT pattern in Experiment 1 as can be seenin Figure 4d

Discussion

In Experiment 2 GermanndashEnglish bilinguals named thesame PWs that were presented for language decision inExperiment 1 In general naming in L1 was not faster

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 14: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

Sublexical and lexical effects on language membership decision 591

Table 9 Summary of linear mixed-effects regression for reaction times to marked PW inExperiment 2 Orthographic neighborhood size diffOLD = OLDE ndash OLDG mean bigramfrequency difference diffBF = BFG ndash BFE maximal bigram frequency difference maxdiffBF is themaximal bigram frequency difference across all bigrams of a word

Predictor beta-estimate SD t-value X2 p-value Significance

Intercept 77475 3866 2004 40160 lt 001 lowastlowastdiffOLD 1583 1230 129 166 20

diffBF minus2687 1511 minus178 316 08 +

Marker Language1 minus2521 2437 minus103 107 30

diffOLD diffBF minus1437 1281 minus112 126 26

diffOLD Marker Language minus2990 1684 minus178 315 035 1

diffBF Marker Language 1972 1911 103 107 30

diffOLD diffBF Marker Language 2056 1633 126 159 21

Note Significance of p-values + p lt 1 lowast p lt05 lowastlowast p lt 001 1one-tailed p lt051 Marker language was dummy-coded with English as baseline

700

750

800

850

German EnglishmaxdiffBF

RT

[mse

c]

a) diffOLD x maxdiffBF p =07

700

750

800

850

German EnglishNaming Language

RT

[mse

c]b) diffBF x Response

main effect of diffBF p = 03

700

750

800

850

German EnglishNaming language

RT

[mse

c]

c) diffBF x maxdiffBF x Response p = 01

maxdiffBF German English

diffOLD German English

diffBF German English

Figure 5 Visualization of the effects of differences in orthographic neighborhood size (diffOLD) maximal bigramfrequency difference (maxdiffBF) mean bigram frequency difference (diffBF) and response language on RT to neutralpseudo-words in Experiment 2 a) Interaction effect of diffOLD and diffBF b) Effect of diffBF which was independent ofthe response language c) Three-way interaction effect of diffBF maxdiffBF and response language

than naming in L2 reflecting the relatively high L2proficiency of our participants even though descriptivestatistics showed a tendency towards faster response timeswhen naming in German than in English As expectedcontinuous differences in orthographic neighbourhood

sizes as well as the language-typicality of single bigramsguided language attribution of neutral pseudo-words Wealso found that ndash in both languages ndash naming latencies forneutral pseudo-words were reduced when mean bigramfrequency was more L1-typical

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 15: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

592 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Large orthographic neighbourhoods in the non-markerlanguage lead to more errors for L1-marked PWs only Asalready discussed in Experiment 1 this is probably due toespecially high sensitivity to violation of L1 orthographicpatterns such that respective items are immediatelyassigned to L2 before orthographic neighbourhoods canaffect this decision

Overall the results of Experiment 2 further supportthe results of Experiment 1 suggesting that languagemembership information from all processing levels isintegrated during naming even in cases where categoricalsublexical evidence in form of orthographic markersis available Additionally they demonstrate that in aproduction task processing of letter strings with L1-typicalsublexical structure is facilitated

General discussion

The present study investigated the contribution of fine-grained sublexical and lexical statistical differencesbetween languages to explicit language decisions ina language decision task (Experiment 1) and implicitlanguage decisions in a naming task (Experiment 2) andcontrasted them with the effects of orthographic markersTo create a task context most sensitive to small variationsin language similarity we used pseudo-words whichhave ambiguous language membership and no semanticmeaning

Our results show that subtle differences in languagesimilarity at sublexical and lexical levels affect languageattribution and naming of language-ambiguous letterstrings corroborating bilingualsrsquo sensitivity to language-specific frequency statistics despite generally languagenon-selective processing at all stages (Costa et al 2006Jared amp Kroll 2001 Kaushanskaya amp Marian 2007Lemhoumlfer amp Dijkstra 2004)

Moreover we extend previous studies on orthographicmarkers (Casaponsa et al 2014 Vaid amp Frenck-Mestre2002 van Kesteren et al 2012) by directly showing thatin the presence of orthographic markers of L1 but not L2lexical information is assessed as continuous differencesin orthographic neighborhood sizes influenced processingof L1-marked pseudo-words Our results provide a directcomparison of the effects of cross-language lexicalneighborhood information for L1- and L2-marked PWs

Effects of differences in sublexical frequencies

We manipulated two different types of continuoussublexical information namely maximal (maxdiffBF) andmean (diffBF) bigram frequency difference

The main effect of maxdiffBF for neutral PWs persistedbeyond the effects of differences in lexical neighborhoodsizes suggesting that maxdiffBF captures a distinctsublexical source of language membership information

In fact for marked letter strings maxdiffBF captures thefrequency difference of the marker bigram opening up thequestion whether in fact orthographic markers constitutethe extremes of a continuous statistic rather than being adifferent dichotomous variable

Distributions of mean bigram frequencies in Germanand English differed only minimally in the corpusanalysis This is not surprising as it is expected that twolanguages with similar morphological structure such asGerman and English will be similar in sublexical structure(cf Marian et al 2012 for similar findings on Englishand Spanish) As expected given this high similaritybetween mean bigram frequency distributions in Germanand English this variable did not cue language decisionsHowever our data show that continuous differences inmean bigram frequencies affect response latencies Thiseffect was most prominent for neutral PWs in the namingtask but also marginally present for marked PWs in thenaming task and for neutral PWs in the language decisiontask We interpret this in terms of greater reliance on thesublexical reading route in the naming task than in thelanguage decision task as overt pronunciation of non-lexical items is required in the former but not in thelatter task (Coltheart et al 2001) The efficiency of thesublexical reading route depends on the typicality andthus frequency of involved sublexical representationsThe fact that higher L1-similarity at the sublexical levelled to shorter naming latencies suggests that mappingof L1-typical bigrams onto phonology is especiallyefficient (Gollan amp Goldrick 2012) Alternatively andwithout assuming a phonological source of this effectndash that likely requires activation of language-specificphonological units ndash the present effect may also arisewhen participants activate (language-unspecific) thoseorthographic units more quickly that are more familiarto them because they encounter them more often in their(constantly used) first than in their second language Wecannot distinguish between these options based on ourdata as in both cases greater effects in the naming taskfor neutral PWs could be due to increased reliance onsublexical representations The presence of a marginaleffect for marked PWs in the naming task only is likelydue to the fact that a single bigram (the marker) cansuffice for language decision whereas naming requiresthe complete mapping of all graphemes to phonemes(allowing for fine grained differences concerning otherthan the marker bigrams to influence naming latencies)

Note that differences between mean and maximalbigram frequencies were apparent in both tasks WhilemaxdiffBF affected language decisions and reactiontimes in the naming task diffBF only had an effect onreaction times in both tasks The effect maxdiffBF onreaction times was modulated by participantsrsquo responseSpecifically conflict between language membershipinformation stemming from maxdiffBF and diffOLD

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 16: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

Sublexical and lexical effects on language membership decision 593

slowed processing and agreement lead to faster responsesThis pattern is most in line with its role as a source forlanguage membership information In contrast to thisthe effects of diffBF are best interpreted in terms of anadvantage for L1-similar letter strings as high diffBFvalues speeded responses independently of the responselanguage

The different roles of diffBF and maxdiffBF in ourstudy might be due to the specific languages at handFuture research should investigate whether a comparisonof languages with more distinct orthotactic structures willyield more pronounced effects of diffBF on languageattribution

Effects of differences in lexical neighborhood sizes

Language membership categorization in both experimentswas guided by differences in orthographic neighborhoodsizes More specifically neutral pseudo-words werepreferentially categorized to the language that waspredominant in their orthographic neighborhood Thisconverges with our corpus analysis where strongdifferences between the number of within and cross-language neighbors for English and German words wereapparent (cf Shook amp Marian (2013) for similar findingson English and Spanish) It is also in line with other studiesreporting effects of within-language and cross-languageneighbors (Dijkstra et al 2010 Midgley Holcomb vanHeuven amp Grainger 2008 see also Jared 2001)

We also find that lexical neighborhood statistics playa central role in language attribution of L1-marked butnot L2-marked PWs More specifically for L1-markedPWs with more cross-language than within-languageneighbors erroneous categorizations were more frequentand correct categorizations were slowed Interestingly thiswas not the case for L2-marked PWs the processing ofwhich appeared unaffected by the number of orthographicneighbors from the two languages Vaid amp Frenck-Mestre(2002) already suggested that L2-specific orthography(violating L1 orthographic rules) allows for a ratherperceptual (ie non-lexical) language attribution strategyOur manipulation of diffOLD in marked PWs nowallows directly estimating the extent of lexical searchin the two languages during processing of marked PWResults show that indeed perception of L2 markers wassufficient to trigger language decisions while for L1-marked PWs mandatory activation of lexical neighborsseems to influence the decision process In other wordsviolation of overlearned L1 orthographic patterns offerssufficient cues for language attribution whereas the samedoes not hold for less well-represented orthographicpatterns from L2

Overall our findings provide novel evidence forthe activation of cross-language lexical neighbors forlanguage-ambiguous as well as L1-marked pseudo-

words demarcating a difference from the (lack of)effects of cross-language lexical neighbors reportedfor sentential monolingual contexts (Schwartz amp Kroll2006) Furthermore they offer an interesting insight intohow sublexical orthographic cues might modulate the ndashotherwise consistently reported ndash activation of L1 lexicalrepresentations when reading L2 Namely such cross-language activation of words from L1 might be restrictedto cases of sublexical ambiguity whereas activation oflexical representations might focus more exclusively onthe presented L2 language when orthographic patternswould be illegal in L1

Comparison of Experiments 1 and 2

Overall patterns driving language attribution werecomparable across both tasks But data from the twotasks also comprise interesting differences First whileRTs to neutral PWs in Experiment 1 were shorterfor ldquoGermanrdquo than ldquoEnglishrdquo responses this differencewas not significant in Experiment 2 Presumably forneutral PWs our GermanndashEnglish bilinguals tended torespond ldquoGermanrdquo by default but respective effectsseem to have decayed until phonological motor outputcould be produced For marked PWs however RTs toEnglish-marked PWs were significantly faster than toGerman-marked PWs in Experiment 1 while the oppositetendency was found in Experiment 2 This importantdifference aligns perfectly with specific task demands InExperiment 1 lexical neighbourhoods could be blendedout for English-marked but not for German-marked PWsndash as participants seem to have been using a sublexicalstrategy responding especially quickly to violations ofL1 orthographic patterns in the case of L2-marked PWswhich was faster than the integration of lexical andsublexical information for L1-marked PWs In the namingtask on the other hand already the need to produceless familiar L2 overt pronunciation may have cancelledout the processing advantage for L2 marked PWs ofExperiment 1

Second in Experiment 2 there were more and greatereffects of bigram frequency differences for neutralPWs than in Experiment 1 suggesting that languagedecisions in naming rely more heavily on languagemembership information from sublexical orthographicand phonological representation levels Third responsesto sublexically German-typical PWs in the naming taskwere faster independently of the response languagesuggesting that mapping to phonology is more easilyaccomplished for more L1-typical items to which ourparticipants have been more extensively exposed (Gollanamp Goldrick 2012) Fourth effects of diffOLD on RTs forneutral and marked PWs were stronger in the languagedecision task than in the naming task

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 17: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

594 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

This distinctive pattern of effects across the two tasksis well in line with the use of a sublexical grapheme-to-phoneme mapping route (Coltheart et al 2001) whichis independent of lexical activation and becomes morerelevant when phonological output has to be produced forpreviously not encountered letter strings

In general the comparison between the languagedecision and the naming task shows that languagemembership information is present at all levels of visualword processing including the phonological level Thespecific contribution of different sources of languagemembership information to language attribution appearsto depend on task strategies and output modalities Inparticular lexical language similarity has a large impactin a language decision task but its effects appearedreduced for a naming task where responses may beproduced with less activation of lexical representations(see eg Carreiras amp Perea 2004 Conrad Stennekenamp Jacobs 2006) Accordingly we suggest that duringnaming conflict in language membership information ispropagated to the phonological level where sublexical andlexical similarity statistics are integrated in an implicitlanguage decision made through the choice of the mostactive phonological units

Integration in current models of bilingual visual wordrecognition

Our data provide ample evidence in favor of languagemembership representations at the sublexical level notonly for orthographic markers but also for sharedbigrams with different frequencies in the two languagesMoreover we find that language membership informationis not only available for language-specific word-formsas shown in previous studies (Casaponsa et al 2014Vaid amp Frenck-Mestre 2002 van Kesteren et al2012) but that continuous language similarity can beinferred for non-lexical items from the activation oforthographic neighborhoods These findings challengemodels of bilingual visual word recognition to allowfor 1) continuous language membership information inaddition to dichotomic language membership cues and2) language membership information originating fromsublexical representations In particular our data supportthe recent extension of the bilingual interaction activationmodel (BIA+ van Kesteren et al 2012) that includessublexical language nodes Two features of this modelappear especially relevant with regard to the present data

First dichotomic language membership cues whichprevious studies mainly focused on are usuallyimplemented in terms of all-or-none activations oflanguage nodes by language-unique units at lexicaland sublexical levels Such dichotomic cues couldalso be implemented as language tags lsquoattachedrsquoto single language-unique representations (for early

evidence againt this concept see Grainger amp Dijkstra1992) However language tags are not compatible withcontinuous language membership information at thesublexical level language-shared bigrams cannot betagged unambiguously whereas at the lexical levellanguage tags could only have an effect after identificationof a word-form which would contradict the orthographicneighborhood effects for pseudo-words in our studyThus overall our data provide further support for theconcept of language nodes as introduced by Graingeramp Dijkstra (1992) and implemented in the BIA modeland its recent extension to the BIA+ model (van Kesterenet al 2012) Although language nodes in the BIA+ wereconceptualized for dichotomic language membershipinformation they can be extended to include the effects ofprobabilistic language membership information Namelythe strength of connections between sublexical andlexical units and language nodes could reflect thefrequency of these units in the respective language ndashmeeting computational principles of connection weightsor frequency dependent resting levels as suggested byeg Grainger amp Jacobs (1996)

Second van Kesteren and colleagues (2012) proposedto extend the BIA+ model with an additional setof language nodes accumulating language membershipinformation from sublexical representations only(ldquosublexical language nodesrdquo) Alternatively one uniqueset of language nodes might accumulate lexical andsublexical language membership information in parallelIn principle our data as well as the data of van Kesterenet al can be accommodated with either version oflanguage membership nodes However a set of languagemembership nodes activated by language informationfrom both levels of processing ndash lexical and sublexicalndash appears a more parsimonious solution which wouldincorporate the general principles of interactive activationspreading over different representation levels Futurestudies ndash including simulation studies and neuroimagingndash are required to further test respective hypotheses

Conclusions

Language membership information appears to beavailable at all levels of the visual word processingsystem It may be delivered via definitive cues suchas orthographic markers but as well via probabilisticcues such as sublexical and lexical statistics The presentstudy provides ample evidence for the processing andinteraction of language-specific sublexical and lexicalstatistics in the bilingual brain It shows that despitebilingualsrsquo generally assumed language-independent wayof processing if required probabilistic information canbe retrieved for each language separately and can be usedto guide the processing of linguistic input Future studies

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 18: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

Sublexical and lexical effects on language membership decision 595

may specify the dependence of such findings on languageproficiency and extend them to real word material

Supplementary material

For supplementary material accompanying this papervisit httpdxdoiorg101017S1366728915000292

References

Andrews S (1997) The effect of orthographic similarityon lexical retrieval Resolving neighborhood conflictsPsychonomic Bulletin amp Review 4(4) 439ndash461

Baayen R H Davidson D J amp Bates D M (2008) Mixed-effects modeling with crossed random effects for subjectsand items Journal of Memory and Language 59(4) 390ndash412 doi101016jjml200712005

Bailey T M amp Hahn U (2001) Determinants ofwordlikeness Phonotactics or lexical neighborhoodsJournal of Memory and Language 44(4) 568ndash591doi101006jmla20002756

Balota D A Yap M J Cortese M J Hutchison K AKessler B Loftis B Neely JH Nelson DL SimpsonGB amp Treiman R (2007) The English LexiconProject Behavior Research Methods 39(3) 445ndash59doi103758BF03193014

Barr D J (2013) Random effects structure for testinginteractions in linear mixed-effects models Frontiers inPsychology 4 328 doi103389fpsyg201300328

Bates D M Maechler M Bolker B amp Walker S(2014) lme4 Linear mixed-effects models using Eigenand S4 R package version 11-7 Retrieved fromhttpcranr-projectorgpackage=lme4

Brainard D H (1997) The Psychophysics Toolbox SpatialVision 10(4) 433ndash436 doi101163156856897X00357

Brysbaert M Buchmeier M Conrad M JacobsA M Boumllte J amp Boumlhl A (2011) The wordfrequency effect Experimental Psychology 58(5) 412ndash24doi1010271618-3169a000123

Carreiras M Alvarez C amp de Vega M (1993) Syllablefrequency and visual word recognition in SpanishJournal of Memory and Language 32(6) 766ndash780doi101006jmla19931038

Carreiras M amp Perea M (2004) Naming pseudowords inSpanish effects of syllable frequency Brain and Language90(1-3) 393ndash400 doi101016jbandl200312003

Carreiras M Perea M amp Grainger J (1997) Effects oforthographic neighborhood in visual word recognitioncross-task comparisons Journal of Experimental Psychol-ogy Learning Memory and Cognition 23(4) 857ndash71doi1010370278-7393234857

Casaponsa A Carreiras M amp Duntildeabeitia J A (2014)Discriminating languages in bilingual contexts the impactof orthographic markedness Frontiers in Psychology5(May) 424 doi103389fpsyg201400424

Coltheart M Rastle K Perry C Langdon R amp ZieglerJ C (2001) DRC a dual route cascaded model of visualword recognition and reading aloud Psychological Review108(1) 204ndash56

Conrad M Alvarez C Afonso O amp Jacobs A M (2014)Sublexical modulation of simultaneous language activationin bilingual visual word recognition The role of syllabicunits Bilingualism Language and Cognition in pressdoi101017S1366728914000443

Conrad M Carreiras M Tamm S amp Jacobs A M(2009) Syllables and bigrams orthographic redundancyand syllabic units affect visual word recognition at differentprocessing levels Journal of Experimental PsychologyHuman Perception and Performance 35(2) 461ndash79doi101037a0013480

Conrad M Grainger J amp Jacobs A M (2007) Phonologyas the source of syllable frequency effects in visual wordrecognition evidence from French Memory amp Cognition35(5) 974ndash83 doi103758BF03193470

Conrad M amp Jacobs A M (2004) Replicating syllablefrequency effects in Spanish in German One morechallenge to computational models of visual wordrecognition Language and Cognitive Processes 19(3)369ndash390 doi10108001690960344000224

Conrad M Stenneken P amp Jacobs A M (2006) Associatedor dissociated effects of syllable frequency in lexicaldecision and naming Psychonomic Bulletin amp Review13(2) 339ndash345 doi103758BF03193854

Costa A La Heij W amp Navarrete E (2006) The dynamics ofbilingual lexical access Bilingualism Language and Cog-nition 9(2) 137ndash151 doi101017S1366728906002495

De Groot A M B Borgwaldt S Bos M amp van den EijndenE (2002) Lexical decision and word naming in bilingualsLanguage effects and task effects Journal of Memory andLanguage 47(1) 91ndash124 doi101006jmla20012840

De Groot A M B Delmaar P amp Lupker S J (2000)The processing of interlexical homographs in translationrecognition and lexical decision Support for non-selective access to bilingual memory The QuarterlyJournal of Experimental Psychology 53(2) 397ndash428doi101080713755891

Deutsch A Frost R Pollatsek A amp Rayner K (2000)Early morphological effects in word recognition inHebrew Evidence from parafoveal preview benefitLanguage and Cognitive Processes 15(45) 487ndash506doi10108001690960050119670

Dijkstra T Hilberink-Schulpen B amp van Heuven W J B(2010) Repetition and masked form priming within andbetween languages using word and nonword neighborsBilingualism Language and Cognition 13(03) 341ndash357doi101017S1366728909990575

Dijkstra T amp van Heuven W J B (2002) The architecture ofthe bilingual word recognition system From identificationto decision Bilingualism Language and Cognition 5(03)175ndash197 doi101017S1366728902003012

Frost R Kugler T Deutsch A amp Forster K I(2005) Orthographic structure versus morphologicalstructure principles of lexical organization in agiven language Journal of Experimental PsychologyLearning Memory and Cognition 31(6) 1293ndash1326doi1010370278-73933161293

Gollan T H amp Goldrick M (2012) Does bilingual-ism twist your tongue Cognition 125(3) 491ndash7doi101016jcognition201208002

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References
Page 19: Neurocog - Interplay of bigram frequency and orthographic ...neurocog.ull.es/wp-content/uploads/2016/10/Oganian_et_al...Universidad de La Laguna, Tenerife, Spain ARASH ARYANI Department

596 Yulia Oganian Markus Conrad Arash Aryani Hauke R Heekeren Katharina Spalek

Grainger J amp Dijkstra T (1992) On the representation anduse of language information in bilinguals In R J Harris(Ed) Cognitive Processing in Bilinguals (Vol 83 pp 207ndash220)

Grainger J amp Jacobs A M (1996) Orthographic processingin visual word recognition a multiple read-out modelPsychological Review 103(3) 518ndash565

Jaeger T F (2008) Categorical data analysis Away fromANOVAs (transformation or not) and towards Logit MixedModels Journal of Memory and Language 59(4) 434ndash446 doi101016jjml200711007

Jared D (2001) Do Bilinguals Activate PhonologicalRepresentations in One or Both of Their Languages WhenNaming Words Journal of Memory and Language 44(1)2ndash31 doi101006jmla20002747

Jared D amp Kroll J F (2001) Do bilinguals activatephonological representations in one or both of theirlanguages when naming words Journal of Memory andLanguage 44(1) 2ndash31 doi101006jmla20002747

Kaushanskaya M amp Marian V (2007) Bilingual languageprocessing and interference in bilinguals Evidence fromeye tracking and picture naming Language Learning51(1) 119ndash163 doi101111j1467-9922200700401x

Keuleers E (2013) vwr Useful functions for visual wordrecognition research R package version 030 Retrievedfrom httpcranr-projectorgpackage=vwr

Kleiner M Brainard D H Pelli D Ingling A Murray Ramp Broussard C (2007) Whatrsquos new in Psychtoolbox-3Perception 36(14) 1ndash1

Lemhoumlfer K amp Broersma M (2011) Introducing LexTALEA quick and valid Lexical Test for Advanced Learnersof English Behavior Research Methods 44(2) 325ndash343doi103758s13428-011-0146-0

Lemhoumlfer K amp Dijkstra T (2004) Recognizing cognatesand interlingual homographs effects of code similarity inlanguage-specific and generalized lexical decision Mem-ory amp Cognition 32(4) 533ndash50 doi103758BF03195845

Lemhoumlfer K Dijkstra T Schriefers H J Baayen R HGrainger J amp Zwitserlood P (2008) Native languageinfluences on word recognition in a second languagea megastudy Journal of Experimental PsychologyLearning Memory and Cognition 34(1) 12ndash31doi1010370278-739334112

Li P Sepanski S amp Zhao X (2006) Language historyquestionnaire A web-based interface for bilingualresearch Behavior Research Methods 38(2) 202ndash10doi103758BF03192770

Libben M amp Titone D (2009) Bilingual lexical access incontext evidence from eye movements during readingJournal of Experimental Psychology Learning Memoryand Cognition 35(2) 381ndash390 doi101037a0014875

Marian V Bartolotti J Chabal S amp Shook A(2012) CLEARPOND cross-linguistic easy-access re-source for phonological and orthographic neighborhooddensities PloS One 7(8) e43230 doi101371jour-nalpone0043230

Midgley K J Holcomb P J van Heuven W J B amp GraingerJ (2008) An electrophysiological investigation of cross-language effects of orthographic neighborhood Brain Re-search 1246 123ndash35 doi101016jbrainres200809078

Moll K amp Landerl K (2010) SLRT-IIndashVerfahren zurDifferentialdiagnose von Stoumlrungen der Teilkomponentendes Lesens und Schreibens Bern Hans Huber

Protopapas A (2007) CheckVocal a program to facilitatechecking the accuracy and response time of vocal responsesfrom DMDX Behavior Research Methods 39(4) 859ndash62doi103758BF03192979

Schulpen B Dijkstra T Schriefers H J amp Hasper M (2003)Recognition of interlingual homophones in bilingualauditory word recognition Journal of ExperimentalPsychology Human Perception and Performance 29(6)1155ndash78 doi1010370096-15232961155

Schwartz A I amp Kroll J F (2006) Bilingual lexical activationin sentence context Journal of Memory and Language55(2) 197ndash212 doi101016jjml200603004

Shook A amp Marian V (2013) The Bilingual languageinteraction network for comprehension of speechBilingualism Language and Cognition 16(2) 304ndash324doi101017S1366728912000466

Thomas M S C amp Allport A (2000) LanguageSwitching Costs in Bilingual Visual Word RecognitionJournal of Memory and Language 43(1) 44ndash66doi101006jmla19992700

Torgesen J K amp Rashotte C A (1999) TOWREndash2 Test ofWord Reading Efffciency

Vaid J amp Frenck-Mestre C (2002) Do orthographic cuesaid language recognition A laterality study with French-English bilinguals Brain and Language 82(1) 47ndash53doi101016S0093-934X(02)00008-1

Van Kesteren R Dijkstra T amp de Smedt K (2012) Marked-ness effects in Norwegian ndash English bilinguals Task-dependent use of language- specific letters and bigramsThe Quarterly Journal of Experimental Psychology 65(11)2129ndash54 doi101080174702182012679946

Westbury C amp Buchanan L (2002) The Probability ofthe Least Likely Non-Length-Controlled Bigram AffectsLexical Decision Reaction Times Brain and Language81(1-3) 66ndash78 doi101006brln20012507

Yarkoni T Balota D A amp Yap M J (2008) Movingbeyond Coltheartrsquos N a new measure of orthographicsimilarity Psychonomic Bulletin amp Review 15(5) 971ndash9doi103758PBR155971

httpdxdoiorg101017S1366728915000292Downloaded from httpwwwcambridgeorgcore Universidad de la Laguna on 09 Oct 2016 at 094719 subject to the Cambridge Core terms of use available at httpwwwcambridgeorgcoreterms

  • Introduction
    • The present study
      • Corpus analysis
        • Methods
          • Orthographic neighborhood size
          • Bigram frequencies
            • Results and discussion
              • Experiment 1 Language Decision Task
                • Methods amp materials
                  • Participants
                  • Stimuli amp Design
                  • Neutral pseudo-words
                  • Marked pseudo-words
                  • Procedure
                  • Statistical analysis
                    • Results
                      • Effect of marker presence on language decisions
                      • Effects of continuous variables in the absence of markers Language decision
                      • Effects of continuous variables in the absence of markers Response times
                      • Interactions of markers and continuous variables3 Language decision
                        • Interactions of markers and continuous variables Response times
                        • Discussion
                          • Experiment 2 Naming Task
                            • Methods amp materials
                              • Participants
                              • Procedure
                                • Results
                                  • Effect of marker presence on response language
                                  • Effect of continuous variables in the absence of markers
                                  • Effect of continuous variables in the absence of markers Response language
                                  • Effect of continuous variables in the absence of markers Naming latencies
                                    • Interactions of markers and continuous variables Naming language
                                    • Interactions of markers and continuous variables Naming latencies
                                    • Discussion
                                      • General discussion
                                        • Effects of differences in sublexical frequencies
                                        • Effects of differences in lexical neighborhood sizes
                                        • Comparison of Experiments 1 and 2
                                        • Integration in current models of bilingual visual word recognition
                                        • Conclusions
                                          • Supplementary material
                                          • References