

+General Introduction to Needs in the Field Now:

Needs of multilingualism in language acquisition

Development of Linguistic Linked Open Data (LLOD) Resources for Collaborative Data-Intensive Research in the Language Sciences

Saturday, July 25th, 2015

María Blume

Isabelle Barrière

Cristina Dye

Carissa Kang

for the VCLA

LSA Linguistic Summer Institute 2015, U. of Chicago

& Jonathan Masci

+Main goals

Requirements for conducting research with multilingual populations.

Challenges for the development of Linked Open Data (LOD) in the field of multilingual acquisition.

Capacities and needs of any primary research tool that would allow us to achieve the vision of LOD.

+Importance of research with multilingual populations

Multilinguals constitute the majority of the world population and a growing proportion of the US population (e.g. McCabe et al., 2013)

Theories of language and cognitive models of language development must account for language use, processing and acquisition in multilinguals.

+Main methodological issues

i. Criteria for participant recruitment.

ii. Assessment of the degree of multilingualism of the participants.

iii. Issues of working with multilingual data.

Challenges for the data markup that must become part of the metadata and the data which is the object of study.

+Problems

There is no single clear definition of bilingualism (Grosjean 2010; Mackey 2012).

The amount of individual variation in the many factors that affect language acquisition and use makes it very difficult to compare participants.

+Problems

Nevertheless, research on bilingualism (its cognitive effects, its relationship with literacy, comparisons with L1 acquisition) depends on having established the nature of the participants’ bilingualism.

+Participant recruitment

Main issue: Large variability of bilingual speakers.

Need to classify them so that we know what data are comparable.

Need to gather extensive metadata

+Typical situations that involve bilingualism

Group A: Children who are bilingual from birth, either having two languages at home or one at home and one outside the home.

Group B: Early bilinguals who start learning a second language sometime after birth, normally having one language at home and one outside the home.

Group C: People who learned a second language in adulthood and use it for work-related activities.

Group D: Immigrants who must learn a second language to survive in the new country, or speakers of a minority language in their own country who must learn the dominant language.

+Main issues related to choosing and classifying bilingual speakers

• Insufficient factors are considered when bilingual speakers are selected for a study (e.g. speakers are classified according to age of acquisition, but patterns of use are not considered).

+Main issues related to choosing and classifying bilingual speakers

• Different researchers rely on different criteria when selecting the participants of their study and use various assessment tools to collect information about the speakers’ language proficiency and language history. As a result, different studies analyze different groups of bilinguals (Grosjean, 2008).

+Main issues related to choosing and classifying bilingual speakers

• There are no clear standards for classifying bilingual/L2 speakers across studies, and therefore it is impossible to tell if the results of a given study will generalize to other groups of bilingual/L2 speakers.

+Possible Solutions

Researchers should provide as much information as possible about the participants in their studies, the criteria used to classify bilinguals in different groups and the language assessment tools used in the study.

+What should one document? (Grosjean 2008)

• Language history and language relationship: Which languages were acquired (when and how)? What was the pattern of language use?

• Language stability: Are one or several languages still being acquired? Has a certain language stability been reached?

• Function of languages: Which languages are used for what purposes, in what context and to what extent?

+What should one document? (Grosjean 2008)

• Language proficiency: What is the bilingual’s proficiency in the four skills in each language?

• Language modes: How often and for how long is a bilingual in monolingual and bilingual mode? When in a bilingual mode, how much code-switching and borrowing takes place?

• Biographical data: What is the bilingual’s sex, age, socio-economic status, etc.?

+A minimal questionnaire

+The Multilingualism Questionnaire

The VCLA members have created a much more extensive questionnaire.

+DTA subject and session screens

Detailed data

The multilingualism questionnaire is only an attachment, so its contents are not searchable.

+Subject screen

ID, name and gender

DOB

Nationality, ethnicity, place of birth.

Language or cognitive impairment.

Info on Human Subjects requirements.

Multilingualism questionnaire completed.

Contact info for subject

Comments.

Language(s), dialect(s), and comprehension and production levels in each language.

Info on caregivers.
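The subject-screen fields listed above could be held in a structured record so that they can be searched, unlike an attached questionnaire. The following is only a minimal sketch: the class names, field names and types are assumptions made for illustration and are not the DTA tool's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class LanguageEntry:
    """One language or dialect of the subject (hypothetical field names)."""
    language: str
    dialect: Optional[str] = None
    comprehension_level: Optional[int] = None  # scale is lab-specific
    production_level: Optional[int] = None     # scale is lab-specific


@dataclass
class SubjectRecord:
    """Illustrative mirror of the subject-screen fields, not the real schema."""
    subject_id: str
    name: str
    gender: str
    date_of_birth: date
    nationality: str = ""
    ethnicity: str = ""
    place_of_birth: str = ""
    impairment: str = ""                  # language or cognitive impairment
    human_subjects_info: str = ""
    questionnaire_completed: bool = False
    contact_info: str = ""
    comments: str = ""
    languages: List[LanguageEntry] = field(default_factory=list)
    caregivers: List[str] = field(default_factory=list)

    def speaks(self, language: str) -> bool:
        """True if the subject has an entry for the given language."""
        return any(entry.language == language for entry in self.languages)


def subjects_speaking(subjects: List[SubjectRecord], language: str) -> List[str]:
    """Return the IDs of all subjects with an entry for the given language."""
    return [s.subject_id for s in subjects if s.speaks(language)]
```

With records of this kind, a question such as "which subjects have Korean among their languages" becomes a one-line query rather than a manual reading of attachments.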

+Session metadata

Session ID

Date

Interviewer

Assistants

Length of session

Tasks

Languages used

Session location

Subject: age; number and position among siblings; address and length of residency; education, occupation, school

Name and ID in transcription for other participants in session

General activities

Analysis performed on data
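Session metadata of this kind can be linked to the subject record, and the subject's age at the session can be derived rather than typed in by hand. The sketch below is an assumption-laden illustration: `SessionRecord` and `age_years_months` are hypothetical names, and the years;months output simply follows the age notation used elsewhere in these slides (e.g. 4;2).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


def age_years_months(date_of_birth: date, session_date: date) -> str:
    """Return the age at the session in years;months notation."""
    months = ((session_date.year - date_of_birth.year) * 12
              + (session_date.month - date_of_birth.month))
    if session_date.day < date_of_birth.day:
        months -= 1  # the current month has not been completed yet
    if months < 0:
        raise ValueError("session date precedes date of birth")
    return f"{months // 12};{months % 12}"


@dataclass
class SessionRecord:
    """Illustrative subset of the session metadata (hypothetical field names)."""
    session_id: str
    subject_id: str            # links the session to a subject record
    session_date: date
    interviewer: str
    languages_used: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)
    length_minutes: int = 0
    location: str = ""
```

Deriving age from the date of birth and the session date keeps the two records consistent when the same subject takes part in several sessions.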

+Sample size

Since bilingual speakers vary so much, it is difficult to find speakers who have the same characteristics.

Research studies involving bilingual populations usually have small sample sizes.

A tool that allows for meta-analysis of studies would be enormously beneficial.

+Language and language variety detailed information

Research in multilingual populations frequently involves working with better-known Indo-European languages as well as lesser-studied languages such as Haitian Creole, Yiddish and Quechua.

In this type of research, as in cross-linguistic studies, there is an additional need to include detailed and calibrated information on the language variety so that cross-linguistic development can be compared.

However, such a capability requires individual researchers to store and analyze data in compatible ways.

+Assessment of the degree of multilingualism of the participants

+The Definition Problem

With respect to their language, bilinguals are usually classified in terms of:

language knowledge

amount of use of both languages

domains of use of both languages and their skills (reading and writing).

However, someone’s level of command may change in all these areas throughout his/her lifetime, so bilinguals need to be assessed at different points in their development.

+How is language competence/proficiency assessed?

Different studies and labs use different tests or instruments to measure the degree of bilingualism.

To enable comparison across multilinguals, we need data on how their level of bilingualism was determined:

Specific measures

Specific tests

Task modality (e.g., comprehension or production)

The linguistic domain tested (e.g., vocabulary, grammar)
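One way to picture this requirement is a record that stores, next to each score, how the score was obtained, so that only like-for-like measurements are compared. This is a sketch under stated assumptions: the `Assessment` class and its field names are hypothetical, and no particular test battery is implied.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Assessment:
    """One proficiency measurement plus the metadata needed to compare it."""
    participant_id: str
    language: str
    instrument: str   # the specific test or measure used
    modality: str     # "comprehension" or "production"
    domain: str       # e.g. "vocabulary", "grammar"
    score: float


GroupKey = Tuple[str, str, str, str]


def comparable_groups(assessments: List[Assessment]) -> Dict[GroupKey, List[Assessment]]:
    """Group scores that share language, instrument, modality and domain."""
    groups: Dict[GroupKey, List[Assessment]] = defaultdict(list)
    for a in assessments:
        groups[(a.language, a.instrument, a.modality, a.domain)].append(a)
    return dict(groups)
```

Scores that end up under different keys were obtained in different ways and would need calibration before being compared.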

+Different experimental designs and coding systems

One needs to be able to compare results of different extensive markup systems indicating:

The timing of exposure to relevant stimuli (e.g. the point at which a child hears verbal stimuli when presented with visual stimuli in a picture/video-matching task).

The data source (e.g. total looking time versus first long gaze in an intermodal preferential looking paradigm).

+Problems

Many evaluations of bilingualism in children are based only on parental report through questionnaires (Gutierrez-Clellen & Kreiter 2003; Squires, Bricker, & Potter, 1997; Thordardottir & Weismer, 1996).

+Lust, Flynn, Blume, Park, Kang, Yang, & Kim 2014.

This study showed that it is fundamental to evaluate children’s competence directly as a complement to parental report.

We compared two Korean (L1), English (L2) bilingual four-year-olds who participated in different case studies at Cornell and MIT.

Parental report and general linguistic history showed the children were very similar.

However, results of studies using the Elicited Imitation task showed differences in the children’s language production that the questionnaires did not predict.

+Level of bilingualism according to questionnaire.

Korean was the main language at home.

They used Korean 80% of the time and English 20% of the time.

They were both sequential bilinguals.

They were both Korean dominant. They understood and produced Korean with more proficiency than English.

They felt more comfortable in Korean than in English in all contexts.

+Level of bilingualism according to questionnaire.

Both mothers judged their child’s proficiency as level 2 out of 4 for English and 4 out of 4 for Korean.

+Differences between the children according to the questionnaire.

MJ spent 40 hours a week in daycare, where he was exposed to and spoke English only.

CH spent only 9 hours a week in daycare, where he used English 80% of the time and Korean 20%.

Scale of linguistic abilities in comprehension and production (range 1 to 6): CH had a 6 in Korean and a 3 in English. MJ had a 6 in Korean and a 5 in English.
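To make the daycare figures above easier to compare, they can be converted into approximate hours of English per week. The short sketch below uses only the numbers reported on this slide; treating the reported share of language use as the share of daycare time is a simplifying assumption, and the function name is hypothetical.

```python
def weekly_hours_in_language(daycare_hours: float, share_of_language: float) -> float:
    """Approximate weekly daycare hours spent in one language."""
    if not 0.0 <= share_of_language <= 1.0:
        raise ValueError("share_of_language must be between 0 and 1")
    return daycare_hours * share_of_language


# MJ: 40 hours a week in daycare, English only.
mj_english = weekly_hours_in_language(40, 1.0)
# CH: 9 hours a week in daycare, English 80% of the time.
ch_english = weekly_hours_in_language(9, 0.8)

print(f"MJ: {mj_english:.1f} h/week of English in daycare")
print(f"CH: {ch_english:.1f} h/week of English in daycare")
```

On this rough measure the two children differ considerably in English exposure outside the home, even though the summary questionnaire ratings described them as very similar.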

+Results

MJ imitated all types of coordinate structures perfectly.

CH only had 50% correct in both types of coordination.

A large proportion of CH’s errors were due to the omission of Korean Case markers (-ul/-lul).

+Issues of working with multilingual data.

+Cross-linguistic data

Well-studied Indo-European languages as well as less-studied languages such as Haitian Creole, Yiddish and Quechua.

Include detailed and calibrated information on the language variety so that cross-linguistic development can be compared.

This requires individual researchers to store and analyze data in compatible ways.

+Enabling cross-linguistic comparisons requires rich markup capacity.

Example: acquisition of relative clauses across languages

English relative clause conversion in type:

(1) Experimental stimulus (lexically headed relative clause):

Big Bird pushes the balloon which bumps Ernie.

(2) Child structure (free relative):

Big Bird pushes what bumps Ernie

+Enabling cross-linguistic comparisons requires rich markup capacity.

French relative clause conversion in type:

(3) Experimental stimulus (lexically headed relative clause):

a. Aladdin choisit la chose que Fifi achète

b. Aladdin choose-3S the.FEM thing that Fifi buy-3S

c. ‘Aladdin chooses the thing that Fifi buys.’

(4) Child structure (free relative) (age 4;2)

a. Aladdin choisit ce que Fifi achète

b. Aladdin choose-3S ce that Fifi buy-3S

c. ‘Aladdin chooses what Fifi buys.’

+Enabling cross-linguistic comparisons requires rich markup capacity.

(5) Experimental stimulus (correlative form) verb form:

a. …maltenaa …

b. …mal- t- e- n- aa

c. …make- past- (3.masc.sg) n- Q

(6) Tulu relative clause form (by child, 3;2, verbal adjective form with null head)

a. pada paND(i)na porl(u)ullall

b. pada ti paN- D(i)- na ti porl(u)ullaali

c. song ti say- past- rel ti beautiful.be.3.sg.fem

d. ‘She who sang is beautiful.’

(Somashekar 1999, 216)

+Enabling cross-linguistic comparisons requires rich markup capacity.

Different languages require different data markup.

Coding must account for not only single data fields, but also relations among them.

Addressing this challenge requires a tool that offers a wide range of markup and coding options that are nevertheless standardized as much as possible to permit comparisons across languages.
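The idea of coding relations among fields, and not only single fields, can be sketched with the English and French relative clause examples from the earlier slides. The classes below and the clause-type labels are assumptions made for illustration; they do not reproduce any existing coding scheme.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ClauseCoding:
    """Coding of a single utterance (hypothetical field names)."""
    language: str
    text: str
    clause_type: str        # e.g. "lexically headed relative", "free relative"
    head: Optional[str]     # None when there is no lexical head


@dataclass
class ImitationItem:
    """Relation between a stimulus and the child's response to it."""
    stimulus: ClauseCoding
    response: ClauseCoding

    def converted_type(self) -> bool:
        """True if the child changed the clause type of the stimulus."""
        return self.stimulus.clause_type != self.response.clause_type


def conversions(items: List[ImitationItem]) -> List[Tuple[str, str, str]]:
    """List (language, stimulus type, response type) for converted items."""
    return [(i.stimulus.language, i.stimulus.clause_type, i.response.clause_type)
            for i in items if i.converted_type()]


items = [
    ImitationItem(
        ClauseCoding("English", "Big Bird pushes the balloon which bumps Ernie.",
                     "lexically headed relative", "the balloon"),
        ClauseCoding("English", "Big Bird pushes what bumps Ernie",
                     "free relative", None)),
    ImitationItem(
        ClauseCoding("French", "Aladdin choisit la chose que Fifi achète",
                     "lexically headed relative", "la chose"),
        ClauseCoding("French", "Aladdin choisit ce que Fifi achète",
                     "free relative", None)),
]
```

Because the clause-type labels are shared across languages while the text and head fields stay language-specific, the same query (`conversions(items)`) finds the headed-to-free conversion in both English and French.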

+What does the tool need?

Standardization with flexibility

Linkage across projects and data sets

Efficient data capture

Inventory of fields informed by past research

Capacity to query fields and relations among fields

+Code-switched, code-mixed data

Systems dealing with multilingual data:

Identify the languages at multiple levels.

Mark instances where determining the language is not possible.
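These two requirements can be sketched as a token-level language tag with an utterance-level summary derived from it. The classes and method names below are hypothetical; the tag `und` is the ISO 639 code for an undetermined language, and using ISO 639-3 codes such as `eng` and `cmn` for the other tags is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List

UNDETERMINED = "und"  # ISO 639 code for "undetermined"


@dataclass
class Token:
    """Word-level unit with its own language tag."""
    text: str
    language: str = UNDETERMINED


@dataclass
class Utterance:
    """Utterance-level unit whose language profile is derived from its tokens."""
    tokens: List[Token]

    def languages(self) -> List[str]:
        """Distinct identified languages, in order of first occurrence."""
        seen: List[str] = []
        for t in self.tokens:
            if t.language != UNDETERMINED and t.language not in seen:
                seen.append(t.language)
        return seen

    def switch_points(self) -> int:
        """Number of language changes between consecutive identified tokens."""
        identified = [t.language for t in self.tokens if t.language != UNDETERMINED]
        return sum(1 for a, b in zip(identified, identified[1:]) if a != b)

    def undetermined(self) -> List[str]:
        """Tokens whose language could not be determined (e.g. fillers)."""
        return [t.text for t in self.tokens if t.language == UNDETERMINED]
```

Tokens tagged `und` are skipped when counting switches, so a filler between two English words does not register as a switch, yet it stays retrievable for anyone who wants to study such items.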

+Study on Code-Switching & Attention: Adults & Children

Relationship between code-switching (CS) and attention in English-Mandarin bilinguals

Adults and children

How does depleting attention affect subsequent CS?

Measuring CS in an experimental context: transcription and reliability checks

Language background and CS practices: organizing critical questionnaire data

+Study on Code-Switching & Attention: Adults & Children

Topic: Describe your ideal vacation

Uh... I, uh... definitely want to go taste some good wines from Europe, uh from Germany and France maybe, but uh they're all pretty close, so it should be, I should be able to drive around in these countries. Uh, I definite... want someone to

<BEEP>

如果有人跟我一起去度假会很好,总之我希望我的假期可以安排得比较

<BEEP>

It’s more about good people you hangout with so like the people u come to vacation so it’s not about the it’s it’s about the view but it’s more about the people and as long as like you have good like mental state

<BEEP>

+Challenges: Working with CS Data

1. CS transcripts

Different transcribing methods

Issues with transcribing, e.g., what counts as a filler word?

2. Metadata

Useful to have each participant’s language background information on hand

+Challenges: Working with CS Data

Advantages of sharing data:

i. Facilitates discussion on issues pertaining to transcribing

ii. Making full use of data: People from different fields may be interested in different aspects of the data

iii. Replicating study

+Conclusions

In sum, an LOD perspective, and any primary research tool that helps researchers link their data in the study of multilingualism, requires a cyberinfrastructure that supports collaborative cross-linguistic research and the calibration of complex multilingual markup systems.

+Acknowledgments

National Science Foundation:

Barbara Lust. 2015. Workshop: Development of Linguistic Linked Open Data Resources for Collaborative Data-Intensive Research in the Language Sciences; University of Chicago, July 2015, under the direction of Barbara Lust, María Blume, Antonio Pareja-Lora. Award ID 1463196.

María Blume and Barbara Lust. 2008. Transforming the Primary Research Process Through Cybertool Dissemination: An Implementation of a Virtual Center for the Study of Language Acquisition. NSF OCI-0753415

Janet McCue and Barbara Lust 2004-2006. National Science Foundation Award: Planning Information Infrastructure Through a New Library-Research Partnership. (SGER=Small Grant for Exploratory Research)

Lust, Barbara. 2003. Planning Grant: A Virtual Center for Child Language Acquisition Research. National Science Foundation. NSF BCS-0126546

+Acknowledgments

American Institute for Sri Lankan Studies, Cornell University Einaudi Center.

Cornell University Faculty Innovation in Teaching Awards, Cornell Institute for Social and Economic Research (CISER).

New York State Hatch grant.

The Cornell Institute for Social Science

The Cornell Cognitive Science Program

+Acknowledgments

Funding for Dr. Barriere: NSF, USA/BCS#1251828 and 1251707 awarded to I. Barrière and G. Legendre; ESRC, UK; PSC-CUNY

+Acknowledgments: Isabelle Barrière’s Collaborators

Jennifer Culbertson, U. Edinburgh, Scotland

Nayeli Gonzalez-Gomez, Oxford Brookes, UK

Lisa Hsin, Tufts U., Boston, USA

Sarah Kresh, Graduate Center CUNY, USA

Geraldine Legendre, Johns Hopkins U., USA

Gary Morgan, City U. London, UK

Thierry Nazzi, U. Paris V & CNRS, France

Bencie Woll, U. College London, UK

Erin Zaroukian, Johns Hopkins U., USA

+Acknowledgments: VCLA Founding Members

Suzanne Flynn, MIT

Claire Foley, Boston College

Marianella Casasola, Claire Cardie, James Gair, and Qi Wang, Cornell University

Liliana Sánchez, Rutgers University at New Brunswick

YuChin Chien, California State University at San Bernardino

Usha Lakshmanan, Southern Illinois University at Carbondale

Elise Temple, NeuroFocus

Jennifer Austin, Rutgers University at Newark

+Acknowledgments

VCLA affiliates:

City University of New York: Gita Martohardjono and Valerie Shafer

Ben Gurion University at the Negev: Yarden Kedar

Tyndale University College and Seminary: Sujin Yang

Columbia University: Joy Hirsch

University of California at San Diego: Sarah Callahan

Kyungsung University: Kwee Ock Lee

Central Institute of English and Foreign Languages: R. Amritavalli

Osmania University: A. Usha Rani

+Acknowledgments

Our application developers Ted Caldwell and Greg Kops (GORGES).

Our consultants Cliff Crawford and Tommy Cusick.

Our student RAs: Darlin Alberto, Gabriel Clandorf, Natalia Buitrago, Poornima Guna, Jennie Lin, and Marina Kalashnikova. Martha Rayas Tanaka, Lizzeth Pattison, María Jiménez, and Mónica Martínez at UTEP.

The students at all the participating institutions who helped us with comments and suggestions.

+References

Canale, M. (1983). From communicative competence to communicative language pedagogy. In Language and communication, eds. J. C. Richards and R. W. Schmidt. London: Longman, pp. 2-27.

Canale, M. and Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1: 1-47.

Esquinca, A., Yaden, D., & Rueda, R. (2005). Current language proficiency tests and their implications for preschool English language learners. In J. Cohen, K. T. McAlister, K. Rolstad, & J. MacSwan (Eds.), Proceedings of the 4th International Symposium on Bilingualism (pp. 674-680). Somerville, MA: Cascadilla Press.

+References

Flege, J., MacKay, I., & Piske, T. (2002). Assessing bilingual dominance. Applied Psycholinguistics, 23, 567-598.

Flynn S, Lust B (1981). Acquisition of relative clauses. Developmental changes in their heads.  In: Harbert W, Herschensohn J (eds) Cornell Working Papers in Linguistics 2. Department of Modern Languages and Linguistics, Cornell University, Ithaca, pp 33-45.

Foley C (1996) Knowledge of the syntax of operators in the initial state: The acquisition of relative clauses in French and English. Dissertation, Cornell University

Gathercole, V. C. M. (2010). Bilingual children: Language and assessment issues for educators. In K. Littleton, C. Wood, & J. Kleine Staarman (Eds.), International handbook of psychology in education (pp. 713-748). Bingley, U.K.: Emerald Group.

+References

Genesee, F. (1989). Early bilingual development: One language or two. Journal of Child Language, 16, 161-179. Reproduced in L. Wei (Ed.), The Bilingual Reader (pp. 327-343). London, U.K.: Routledge. 

Genesee, F., Nicoladis, E., & Paradis, J. (1995). Language differentiation in early bilingual development. Journal of Child Language, 22, 611-631.

Grosjean, François. (2008). Studying bilinguals. Oxford: Oxford University Press.

Grosjean, F. (2010). Bilingual: Life and reality. Cambridge, MA: Harvard University Press.

+References

Gutierrez-Clellen, V., & Kreiter, J. (2003). Understanding child bilingual acquisition using parent and teacher reports. Applied Psycholinguistics, 24, 267-288.

Kim, Y.-J. (1997). The acquisition of Korean. In D. Slobin (Ed.), The cross-linguistic study of language acquisition, vol. 4. (pp. 335-443). Hillsdale, NJ: Lawrence Erlbaum.

Lust, B.; Flynn, S.; Blume, M.; Park, S.; Kang, C.; Yang, S., & Kim, A. (2014). Assessing child bilingualism: Direct assessment of bilingual syntax amends caretaker report. The International Journal of Bilingualism, pp. 1-20.

+References

Mackey, W. (2012). Bilingualism in North America. In T. Bhatia & W. Ritchie (Eds.), Handbook of bilingualism and multilingualism (pp. 707-724). Oxford, UK: Blackwell.

McCabe, A., Tamis-LeMonda, C. S., Bornstein, M. H., Brockmeyer Cates, C., Golinkoff, R., Guerra, A. W., Hirsh-Pasek, K., Hoff, E., Kuchirko, Y., Melzi, G., Mendelsohn, A., Páez, M., & Song, L. (2013). Multilingual children beyond myths and towards best practices. Social Policy Report, Sharing Child and Youth Development Knowledge, 27 (4), 1-37.

Paradis, J., Emmerzael, K., & Duncan, T. (2010). Assessment of English language learners: Using parent report on first language development. Journal of Communication Disorders, 43, 474-497.

Pease-Álvarez, L., Hakuta, K., & Bayley, R. (1996). Spanish proficiency and language use in California’s Mexicano community. Southwest Journal of Linguistics, 15, 137-51.

+References

Somashekar, Shamitha. (1999). Developmental trends in the acquisition of relative clauses: Cross-linguistic experimental study of Tulu. Ph.D. dissertation, Cornell University.

Squires, J., Bricker, D., & Potter, L. (1997). Revision of parent-completed developmental screening tool: Ages and stages questionnaire. Journal of Pediatric Psychology, 22, 313-328.

Thordardottir, E., & Weismer, S. (1996). Language assessment via parent report: Development of screening instrument for Icelandic children. First Language, 16, 265-285.