The BLARK Matrix and its relation to the language resources situation for the Celtic languages
Delyth Prys
Language Technologies Unit, Canolfan Bedwyr
University of Wales, Bangor
The target audience of the BLARK is:
• researchers (both in academia and in industry), and educators
• policy makers and organisations
Welsh Language Technologies
1992 Welsh Language Act
1993 Terminology Centre; Spelling and Grammar Checker
1998 Welsh Assembly Government
2000 NAACLT Conference, Limerick
2001 e-Welsh Unit founded
2003 + Speech Technology
     + Place-name Centre
2006 Welsh Language Board IT Strategy
2006 e-Welsh > Language Technologies Unit
LR Activities
• Terminology Standardization
  – 23 projects
• Lexical projects
  – English/Welsh, Irish/Welsh, digitization
• Language tools
  – spelling, grammar checkers, hyphenators
• Place-name archives
  – historical, contemporary
• Speech Technology
  – speech processing resources for Welsh and Irish
• Computer-assisted language learning
  – CDs and web-based, e.g. the BBC's LearnWelsh website
[Please imagine knight in shining armour galloping across screen carrying BLARK banner]
Celtic Community
• Common linguistic heritage
• Common socio-political situation
• Some common pan-Celtic institutions
• Grant possibilities
• Celtic is sexy!
Celtic overview
                     Irish   Scots Gaelic   Manx   Welsh   Breton   Cornish
Official Status      xxx     xx             xx     xx      x        x
No. of users         xx      xx             x      xxx     xx       x
Prognosis            xx      xx             xx     xxx     xx       xx
Academic community   xxx     xx             x      xxx     xx       x
States & Languages
                 Irish        Scots Gaelic   Manx     Welsh   Breton   Cornish
Member State     Ireland/UK   UK             Non-EU   UK      France   UK
Bilingual with   E            E              E        E       F        E
Language packs   C + V        V              V        C + V   V        V
BLARK Recipe
• Applications (e.g. CALL, access control)
• Modules (e.g. morphological analysis, speech synthesis)
• Language data (e.g. data sets, descriptors)
Available Written Data
             Irish   Scots Gaelic   Manx   Welsh   Breton   Cornish
Religious    xxx     xxx            xxx    xxx     xxx      xx
Legal        xxx     xx             xx     xx      x        x
Bilingual    xxx     xxx            xxx    xxx     xxx      xxx
Newspapers   x       x              x      x       x        x
Available Spoken Data
                Irish   Scots Gaelic   Manx   Welsh   Breton   Cornish
Religious       xx      xxx            x      xxx     x        x
Broadcast       xxx     xx             x      xxx     x        x
Other formal    xxx     xx             xx     xxx     xx       xx
Free dialogue   xxx     xx             x      xxx     xx       x
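The written and spoken data ratings can also be held in a small data structure for comparison across languages. The following is only a rough sketch: the names LANGUAGES, WRITTEN, SPOKEN, totals and weakest_categories are hypothetical, and counting one point per "x" is an assumption made here for illustration, not a scoring scheme defined in the presentation.

```python
# Ratings transcribed from the "Available Written Data" and
# "Available Spoken Data" slides (column order as on the slides).
LANGUAGES = ["Irish", "Scots Gaelic", "Manx", "Welsh", "Breton", "Cornish"]

WRITTEN = {
    "Religious":  ["xxx", "xxx", "xxx", "xxx", "xxx", "xx"],
    "Legal":      ["xxx", "xx", "xx", "xx", "x", "x"],
    "Bilingual":  ["xxx", "xxx", "xxx", "xxx", "xxx", "xxx"],
    "Newspapers": ["x", "x", "x", "x", "x", "x"],
}

SPOKEN = {
    "Religious":     ["xx", "xxx", "x", "xxx", "x", "x"],
    "Broadcast":     ["xxx", "xx", "x", "xxx", "x", "x"],
    "Other formal":  ["xxx", "xx", "xx", "xxx", "xx", "xx"],
    "Free dialogue": ["xxx", "xx", "x", "xxx", "xx", "x"],
}


def totals(table):
    """Sum the x marks per language (assumption: each x counts as one point)."""
    sums = [0] * len(LANGUAGES)
    for marks in table.values():
        for i, mark in enumerate(marks):
            sums[i] += len(mark)
    return dict(zip(LANGUAGES, sums))


def weakest_categories(table, language):
    """Return the categories with the lowest rating for one language."""
    idx = LANGUAGES.index(language)
    lowest = min(len(marks[idx]) for marks in table.values())
    return [cat for cat, marks in table.items() if len(marks[idx]) == lowest]


if __name__ == "__main__":
    print("Written:", totals(WRITTEN))
    print("Spoken: ", totals(SPOKEN))
    print("Weakest written category for Welsh:", weakest_categories(WRITTEN, "Welsh"))
```

Read this way, newspapers are the weakest written category for every language, and the spoken totals separate Irish and Welsh from the other four. The totals are only as meaningful as the additive assumption behind them.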
Conclusions
• BLARK not only good but desirable
• Develop Pre-BLARK in order to be realistic
• Also look for other solutions in data-poor environments
• Archive and store all possible data
• Educate and train
• Mobilise wider academic community
• Lobby policy makers