technological approaches to linguistic documentation and meta-documentation
TRANSCRIPT
April 12, 2023 1
Technological Approaches to Linguistic Documentation
and Metadocumentation
Pankaj Dwivedi
Gulab Chand
Somdev Kar
Indian Institute of Technology Ropar
Rupnagar, Punjab 140001
India
Language Documentation
Principles and methods used for the recording and analysis of primary language and cultural materials, and metadata about them.
Thanks to the revolution in information technology, it is now possible to maintain organized and long-lasting linguistic and cultural records, which was not feasible before.
Why is documenting languages IMPORTANT?
Half of the world's languages may cease to exist within a few generations, because children are no longer learning them as first languages (Austin & Sallabank, 2011).
Crystal (2002) claims that the rate of language disappearance is as high as two languages each month.
How?
Creating Dictionaries
Preparing Language Teaching Materials
Archiving
Language Corpora (Written & Spoken)
What is needed?
A lot of language data and the latest technology
Language data: Text, Audio and Video
Technology: software and tools that can handle the language data, and platforms on which these data can be used effectively.
What do we need?
Language data (No Problem)
Platforms (will see later on)
Latest TOOLS and SOFTWARE for:
1. Recording and Capturing
2. Analysis
3. Archiving
4. Mobilization
ONE MOMENT!!!
Is ‘Latest’ the best?
or
Old is gold?
CHOOSE CAREFULLY!!!
Is ‘TECHNOLOGY’ adoption always good?
Languages may live on without orthography. But no language will be able to function as an administrative language in a modern society without a developed language technology (Trosterud, 2006).
Technology changes quickly, and an uncritical adoption of new tools and technologies might compromise long-term sustainability, portability, usability and compatibility with other platforms (Bird & Simons, 2003).
Striking a balance
Portability: operating systems, formats, software, encodings
Sustainability: long-term preservation and usefulness
Maintenance and Distribution: finances, space, tools and reach
Access and protocols: paid or free, open or closed, research or business, full or partial
Capturing Audio Media
Why or Why not WAV?
Capturing Video Media
CODECS
CONTAINERS
Capturing Digital Text
Character Encoding: Unicode, ASCII, Windows/ANSI, Big5, Latin 5, etc.
Data Encoding: XML, SGML, MS Word, etc.
File Encoding: plain text, PDF, MS Word, etc.
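As an illustration of why character encoding matters for portability, the sketch below converts text stored in a legacy encoding to Unicode (UTF-8). It is a minimal example, not a prescribed workflow: the function name to_utf8 and the sample word are assumptions made for illustration, and the source encoding has to be known in advance because it cannot be read reliably from the bytes themselves.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known legacy encoding as UTF-8.

    Hypothetical helper for illustration. Decoding is strict, so a byte
    sequence that is invalid in the stated encoding raises
    UnicodeDecodeError instead of silently corrupting the text.
    """
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


# Sample word stored in ISO 8859-9 (Latin 5), one byte per character.
legacy_bytes = "Türkçe".encode("iso8859_9")
utf8_bytes = to_utf8(legacy_bytes, "iso8859_9")

# Same text, different byte representation: 6 bytes versus 8 bytes.
print(len(legacy_bytes), len(utf8_bytes))
print(utf8_bytes.decode("utf-8"))
```

The same call works for the other legacy encodings named above, for example "big5" or "cp1252" (a common Windows/ANSI code page), provided the correct codec name is supplied.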
Digital text: An overview
Analysis tools
Transcription
Annotation
Translation
Metadata Management
Popular Tools
Metadata Management
Cataloguing: title, speakers, collectors, time and place, language name etc.
Descriptive: information about content, relationship to other content etc.
Structural: structures and patterns
Technical: description of formats, encoding, required tools and software
Administrative: work log, access protocol etc.
(Nathan & Austin, 2004)
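The five metadata types listed above can be sketched as a single record per recording session. This is only an illustrative data structure, not a schema used by any particular archive: the class name SessionMetadata, its methods, and every value in the example are placeholders invented for this sketch.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SessionMetadata:
    """Hypothetical record grouping metadata by the five types above."""

    cataloguing: dict = field(default_factory=dict)     # title, speakers, place, language
    descriptive: dict = field(default_factory=dict)     # content, links to other content
    structural: dict = field(default_factory=dict)      # structures and patterns
    technical: dict = field(default_factory=dict)       # formats, encodings, required tools
    administrative: dict = field(default_factory=dict)  # work log, access protocol

    def missing_sections(self) -> list:
        """Return the names of metadata types that are still empty."""
        return [name for name, value in asdict(self).items() if not value]

    def to_json(self) -> str:
        """Serialize to plain-text JSON, keeping non-ASCII characters readable."""
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


# Placeholder values only; no real session is described here.
record = SessionMetadata(
    cataloguing={"title": "Placeholder session", "speakers": ["Speaker A"]},
    technical={"format": "WAV", "text_encoding": "UTF-8"},
)

print(record.missing_sections())  # ['descriptive', 'structural', 'administrative']
print(record.to_json())
```

Storing the record as plain-text JSON in UTF-8 is one possible choice that keeps the metadata readable without special software.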
Platforms
1. Online Language Archives:
Examples: OLAC, ANLA, ELAR, CLA, The Language Archive, PARADISEC etc.
2. Social Media: Facebook, Twitter, Blogs, etc.
Examples: ‘Indigenous Tweets’ and ‘Facebook in your language’ by Prof. Kevin Scannell
Conclusion
At a time when the rate of language death is at its peak, if we choose moribund technologies to create and preserve language data, then when those technologies die, unique heritage is either lost or left encrypted (Bird & Simons, 2003).
We must keep in mind:
Purpose, Presentation, Portability and Preservation
References
Austin, P., & Sallabank, J. (Eds.) (2011). The Cambridge handbook of endangered languages. Cambridge University Press.
Bird, S., & Simons, G. (2003). Seven dimensions of portability for language documentation and description. Language, 79(3), 557-582.
Crystal, D. (2002). Language death. Cambridge University Press.
Nathan, D., & Austin, P. (2004). Reconceiving metadata: language documentation through thick and thin. Language documentation and description, 2, 179-187.
Trosterud, T. (2006). Grammatically based language technology for minority languages. Trends in Linguistics Studies and Monographs, 175, 293.
Thank You!
Questions and Feedback.