Folkert de Vriend & Martin Snijders 18/11/2011
Bridging the Gap between First Language Acquisition and Historical Dialectology with the Help of Digital Humanities
Folkert de Vriend & Martin Snijders, 18/11/2011
Time and team
• Project duration: 1 year (May 2011 - May 2012)
• Multidisciplinary team:
  o Leonie Cornips
  o Wilbert Heeringa
  o Marc Kemps-Snijders
  o Martin Snijders
  o Student assistants: Anke, Gertruud, Yvonne
  o Jos Swanenberg
  o Folkert de Vriend
General
• COAVA: COgnition, Acquisition and VAriation Tool
• Aims of COAVA:
  A) Curation of resources from two separate linguistic subdisciplines: first language acquisition and dialect geography.
  B) Development of a demonstrator tool for interdisciplinary research into the lexical characteristics of concepts.
A) Curation
Resources in COAVA
• Seven corpora from CHILDES
  o The Netherlands and Flanders
  o Children (mostly between 2 and 3.5 years)
• Part III of WBD/WLD
  o (Dutch and Flemish) Brabant and Limburg
  o Adults
CLARIN-compliance
Dialect data and CHILDES data
• CMDI-metadata
• Persistent identifiers
• ISOcat
Dialect data
• Lexical Markup Framework (LMF)
B) Demonstrator
Lexical characteristics
• First language acquisition: For some concepts the lexical form is typically acquired early (‘dog’, for instance), while for other concepts the lexical form is typically acquired later (‘blue titmouse’, for instance).
• Dialect geography: For some concepts there is a lot of lexical variation, while for other concepts there is very little variation.
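To make the two characteristics concrete, here is a minimal Python sketch. It is an illustration only, not the COAVA implementation: the records are hypothetical toy data, the dialect forms are placeholders, and "earliest attested age" is used as a crude stand-in for age of acquisition.

```python
from collections import defaultdict

# Hypothetical toy records, NOT taken from the COAVA resources.
# Child data: (concept, age in months at which a form was produced).
child_utterances = [
    ("dog", 26), ("dog", 24), ("dog", 31),
    ("blue titmouse", 40),
]
# Dialect data: (concept, place, lexical form); the forms are placeholders.
dialect_records = [
    ("dog", "place A", "form_1"),
    ("dog", "place B", "form_1"),
    ("blue titmouse", "place A", "form_2"),
    ("blue titmouse", "place B", "form_3"),
]


def first_attested_age(utterances):
    """Earliest age (in months) at which each concept is attested."""
    ages = {}
    for concept, age in utterances:
        ages[concept] = min(age, ages.get(concept, age))
    return ages


def distinct_forms(records):
    """Number of different lexical forms recorded for each concept."""
    forms = defaultdict(set)
    for concept, _place, form in records:
        forms[concept].add(form)
    return {concept: len(found) for concept, found in forms.items()}


ages = first_attested_age(child_utterances)
variation = distinct_forms(dialect_records)
for concept in sorted(ages):
    print(concept, ages[concept], variation.get(concept, 0))
```

Counting distinct forms is only one possible variation measure; a real study would probably also weight forms by how many places use them.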
Value of combined interpretation
• For researchers in both disciplines these characteristics are interesting for at least two reasons:
  o Research into the ‘basic level vocabulary’ of a community
  o Research into the relation between age of acquisition and (dialect) variation
Implementation
• A concept taxonomy is constructed. This taxonomy will only contain concepts for which lexical forms can be found in both resources.
• Since the Dutch CHILDES data mostly contain data for children aged between 2 and 3.5 years, we focus on lexical forms that are nouns.
• To enable linking from this taxonomy to the CHILDES data, the CHILDES data first need to be lemmatised and tagged for their POS (Lexicon by Gilles).
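The selection step described above can be sketched in Python as follows. This is a sketch under stated assumptions, not the project's code: the token triples, the lemma-to-concept mapping, the tag "N" for nouns and the function name are all hypothetical.

```python
# Hypothetical lemmatised and POS-tagged child tokens: (word form, lemma, POS).
childes_tokens = [
    ("honden", "hond", "N"),
    ("loopt", "lopen", "V"),
    ("poes", "poes", "N"),
]
# Hypothetical mapping from lemmas to concept labels.
lemma_to_concept = {"hond": "dog", "poes": "cat"}
# Hypothetical set of concepts that have lexical forms in the dialect data.
dialect_concepts = {"dog", "blue titmouse"}


def shared_noun_concepts(tokens, lemma_map, dialect_set):
    """Concepts with a noun lemma in the child data that also occur in the dialect data."""
    child_concepts = {
        lemma_map[lemma]
        for _form, lemma, pos in tokens
        if pos == "N" and lemma in lemma_map
    }
    return sorted(child_concepts & dialect_set)


print(shared_noun_concepts(childes_tokens, lemma_to_concept, dialect_concepts))
```

Linking through lemmas rather than raw word forms is what makes the lemmatisation step necessary: "honden" and "hond" should end up under the same concept.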
Demo
Technology
• Client server application
• Search services
• Java/Google Web Toolkit
• Apache/Tomcat
• Solr search server
• Open Source
Solr
• Indices, multi core
• Faceted search
• Fast
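As an illustration of what a faceted query against one Solr core can look like, here is a small Python sketch that only builds the request URL. The parameters `q`, `rows`, `wt`, `facet` and `facet.field` are standard Solr query parameters; the core name `dialect`, the field names `concept`, `place` and `lexical_form`, the host and the helper function are assumptions made for this example and are not taken from COAVA.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_facet_query(base_url, core, query, facet_fields, rows=10):
    """Build a Solr select URL that requests facet counts for the given fields."""
    params = [("q", query), ("rows", rows), ("wt", "json"), ("facet", "true")]
    # facet.field may be repeated, once per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Hypothetical core and field names, for illustration only.
url = build_facet_query(
    "http://localhost:8983/solr",
    "dialect",
    "concept:dog",
    ["place", "lexical_form"],
)
print(url)
```

With a multi-core setup, each resource could live in its own core and be queried through the same helper by changing the `core` argument.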
Demo
Thank you