July 11, 2003 E-MELD 2003
E-MELD “School” of Best Practice
Helen Aristar-Dry & Gayathri Sriram
The LINGUIST List
Eastern Michigan University
Overview
• The E-MELD ‘School’ of Best Practice: latest version
  – Purpose
    • What is ‘best practice’?
    • Why ‘best practice’?
  – Organization
• Demo some of the facilities
A note about the name…
• Showroom of BP? ... Nope, it’s got rooms.
• House of BP?
• Funhouse?
• Playhouse?
• Outhouse?
• Bazaar?
• Palace?
• Chateau?
• Shed?
School
What is Best Practice?
Practices designed to ensure that digital language resources:
• endure through time.
• can be reused by others, both now and in the future.
• are as independent as possible of computer environments, scholarly communities, and domains of application.
(Bird & Simons 2003)
Best Practice as we know it …
• Distinguish between the archival format and the presentation format(s). BP is concerned primarily with archival format.
• Archival formats should employ open file formats and open standards.
• Examples of archive formats:
  – Documents: plain text with XML markup.
  – Images: TIFF, 16-bit grayscale format.
  – Audio files: pure (uncompressed) WAV files.
. . . this afternoon
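The split between archival and presentation formats can be illustrated with a minimal sketch. It assumes a hypothetical two-entry lexicon with made-up element names (`lexicon`, `entry`, `form`, `gloss`); none of these names come from the School itself. The XML string stands in for the archival copy, and the HTML table is a throwaway presentation view derived from it.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical archival record: plain text with XML markup.
# Element names and contents are placeholders, not E-MELD's schema.
ARCHIVAL_XML = """<lexicon>
  <entry>
    <form>example-form</form>
    <gloss>water</gloss>
  </entry>
  <entry>
    <form>another-form</form>
    <gloss>fire &amp; smoke</gloss>
  </entry>
</lexicon>"""


def to_presentation_html(archival_xml: str) -> str:
    """Derive a simple HTML table (a presentation format) from archival XML."""
    root = ET.fromstring(archival_xml)
    rows = []
    for entry in root.findall("entry"):
        # Escape text so characters such as '&' stay valid in HTML.
        form = escape(entry.findtext("form", default=""))
        gloss = escape(entry.findtext("gloss", default=""))
        rows.append(f"<tr><td>{form}</td><td>{gloss}</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


html_view = to_presentation_html(ARCHIVAL_XML)
print(html_view)
```

The archival string is never modified; any number of presentation views can be regenerated from it, which is why BP concentrates on the archival copy.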
Best Practice
• Write metadata for the language resource in an approved format.
  Recommended:
  • OLAC format
  • A format mapped to OLAC, e.g., IMDI
• Make the metadata available to a general search engine.
  Recommended:
  • An OLAC service provider, e.g., LINGUIST List
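As a rough illustration of what writing metadata involves, the sketch below builds a small Dublin Core style record wrapped in an OLAC container element. The namespace URIs and the `olac` root element are assumptions based on OLAC 1.1 and should be checked against the current OLAC documentation; the function name `build_record` and all field values are placeholders. A real OLAC record would also carry controlled-vocabulary codes (for example, language identifiers), which are omitted here.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs (Dublin Core elements and OLAC 1.1); verify before use.
DC_NS = "http://purl.org/dc/elements/1.1/"
OLAC_NS = "http://www.language-archives.org/OLAC/1.1/"


def build_record(title: str, creator: str, date: str, subject: str) -> ET.Element:
    """Build a simplified metadata record with four Dublin Core elements."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("olac", OLAC_NS)
    record = ET.Element(f"{{{OLAC_NS}}}olac")
    fields = (("title", title), ("creator", creator),
              ("date", date), ("subject", subject))
    for tag, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{tag}")
        child.text = value
    return record


# Placeholder values for illustration only.
record = build_record(
    title="Sample lexicon",
    creator="Hypothetical Depositor",
    date="2003",
    subject="Sample language",
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A record of this kind is what a harvester would collect from the archive to make the resource discoverable through search.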
Best Practice
• For morphosyntactic markup: countenance different terminology sets but use an ontology of linguistic concepts (GOLD) as an interlanguage
• Relate the different morphological markup schemas to the ontology by means of a metaschema.
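The interlanguage idea can be sketched with two lookup tables. Each hypothetical project keeps its own tags, and each tag is mapped to a shared concept label; translation between projects then goes through the concept rather than directly from tag to tag. The tag sets and concept labels below are illustrative only and have not been checked against GOLD.

```python
from typing import Dict, Optional

# Hypothetical terminology sets, each mapped to shared concept labels.
# The concept labels stand in for ontology concepts; they are assumptions.
PROJECT_A: Dict[str, str] = {"PST": "PastTense", "PL": "Plural"}
PROJECT_B: Dict[str, str] = {"past": "PastTense", "plur": "Plural", "du": "Dual"}


def translate_tag(tag: str, source: Dict[str, str],
                  target: Dict[str, str]) -> Optional[str]:
    """Translate a tag between terminology sets via the shared concept."""
    concept = source.get(tag)
    if concept is None:
        return None  # Tag is unknown in the source terminology.
    for target_tag, target_concept in target.items():
        if target_concept == concept:
            return target_tag
    return None  # The target terminology has no tag for this concept.


print(translate_tag("PST", PROJECT_A, PROJECT_B))
print(translate_tag("du", PROJECT_B, PROJECT_A))
```

With N terminology sets, each set needs only one mapping to the shared concepts instead of N-1 pairwise mappings, which is the practical argument for an interlanguage.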
Why Best Practice?
“Best practice is enduring practice” (Simons, bc)
BP is important for all language documentation . . .
. . . but especially for documentation of endangered languages
Why Best Practice?
• According to the Ethnologue, 52 languages have only 1 speaker left.
• Somewhere 52 field linguists are making audiotapes, videotapes, and transcripts...
What if . . .
– Ten are transcribing in MS Word 6 (which probably won’t be readable in 15 years)
What if . . .
– Ten more are using compressed audio formats? (and compressing away some of the data)
So the School is designed to
• Help users preserve their valuable data for generations to come.
  – Data:
    • Notes
    • Images
    • Audio & video
  – Users:
    • linguists, programmers, archivists
    • (digital) beginners or advanced users
Objectives:
• Teach
• Motivate
• Facilitate
• Invite (suggestions & participation)
What will the School offer?
– Information about the preservation and digitization of data
– Tutorials to provide hands-on training
– Facilities for online operations on the linguist’s own data, e.g., creation of metadata
– Tools (and links to tools) for client-side operations, e.g., text annotation
– Reading material about various aspects of BP
– Showcase of data from 10 endangered languages digitized according to BP
How is the School organized?
– Information
– Tutorials
– Online facilities
– Client-side tools
– Reading material
– Showcase of data from 10 endangered languages
Classroom
Workroom
Tool Room
Reading Room
Exhibit Hall
The Exhibit Hall
Purpose: to show what can be done within the BP framework
• Data (currently) from Biao Min and Mocovi
• Info on the language(s)
• Biao Min lexicon & metadata
  – Archive formats
  – Presentation formats (with some audio)
• Search: cross-language search at a fine-grained morphosyntactic level (thanks to the ontology)
• Comments facility for users
• What else?
Classroom
Teach users how to:
– choose equipment & software
– create metadata and make it available for search
– create an XML file, schema & metaschema
– create and use stylesheets to transform XML files
– annotate & transcribe audio & video files
– acquire ethics
– What else??
Workroom
Where the user gets to work on her own data, using BP tools for:
• metadata creation (ORE)
• terminology mapping
• annotation & transcription
• lexicon creation (FIELD)
• What else?
Reading Room
– Reference materials
– Manuals
– Links to off-site tutorials
– White papers
– Glossary of terms (linked to other pages on the site)
– What else?
Tool Room
Downloads of:
• FIELD (Laptop version)
• Standalone ORE
• Links to LDC, IMDI tools, etc. for
  – Conversion
  – Annotation
• What else?
The “School”
http://emeld.org/school