Investigating an open methodology for designing domain-specific language
collections
Alannah Fitzgerald Shaoqun Wu Martin Barge
Saima Sherazi William Tweddle
http://opendesignnow.org/index.php/visual_index/events/ /
EuroCALL 2014
• Development of Tools and Language Corpora – Design-Based Research with the FLAX Project
• Openness in Corpus-Based Tools, Resources & Practices
• Research with Corpus-Based Approaches – Critical Reflection by Language Teachers with Open Do-It-Yourself ESAP Language Collections
WHO ARE WE IN THIS FLAX RESEARCH & DEVELOPMENT COLLABORATION?
FLAX Language at Waikato University
http://flax.nzdl.org FLAX image by permission of non-commercial reuse by Jane Galloway
FLAX Language Project at the Greenstone Digital Library Lab,
Waikato University NZ
Professor Ian Witten FLAX Project Lead
Dr Shaoqun Wu FLAX Project Lead Researcher & Developer
Research on Open Corpora with FLAX
http://oerresearchhub.org/
OER Research Hypotheses
http://oerresearchhub.org/collaborative-research/hypotheses/
Research with Queen Mary U. of London
http://language-centre.sllf.qmul.ac.uk/home
Be Free to Do Whatever You Want! Soup Dragons:
• Open Resources for ESAP
– Building & Sharing Open ESAP Corpora to Promote DIY Corpus-Based Approaches
– Developing Automated Interactivity into ESAP Corpora
– Developing ESAP Course Book and Lesson Plan Derivatives
– Researching and Developing ESAP Corpora & Derivatives
– Researching and Developing Corpus Tools e.g. Interfaces
http://en.wikipedia.org/wiki/The_Soup_Dragons
OPEN SOURCE LANGUAGE TOOLS DEVELOPMENT
FLAX Digital Library
Collections
Collocations database
Glossary
Open Educational Resources
Google-esque Interface Designs
Designed for the non-expert corpus user, namely: learners, teachers, subject academics, instructional designers and language resource developers.
Introducing the Wikipedia Miner Toolkit (Milne & Witten, 2013)
Building Interactivity into FLAX Language Collections
FLAX Activities Continued
FLAX Across Platforms
• FLAX Website flax.nzdl.org for hosting open online language collections
• Building directly onto the Web with OER
• FLAX multilingual open-source software for download
• Set up your own FLAX server online, or
• Build collections offline for use on your PC
• Using All Rights Reserved content
• FLAX for MOODLE plug-in
• FLAX for MOOC Platforms?
• FLAX in conjunction with translation technologies?
Training Videos for FLAX on YouTube
https://www.youtube.com/user/bananakiwiful/videos
DOMAIN-SPECIFIC OPEN LANGUAGE COLLECTIONS BUILDING
Collaboration with Subject Specialists
“In the emerging academic literacies approach involving cooperation between subject specialists and writing teachers, the aim is to help the students develop metacognitive awareness of the roles and functions of writing in that discipline, to enable them to stand back from it and observe how it functions, and then to help them gradually participate in the genres, where genre is understood as a constellation of actions rather than a list of formal features.” (Breeze, 2012)
Domain-specific Collocations
We focus on lexical collocations with noun-based structures because they are the most salient and important patterns in topic-specific text.
Collocations from the English Common Law MOOC:
• verb + noun e.g. abolish judicial review
• noun + noun e.g. precedent case
• adjective + noun e.g. common law
• noun + of + noun e.g. court of appeal
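The four noun-based patterns above can be illustrated with a minimal sketch that scans part-of-speech-tagged tokens for matching tag sequences. This is not the FLAX implementation: the hand-tagged sample, the tag names, and the function `extract_collocations` are all hypothetical, and a real pipeline would obtain its tags from a POS tagger.

```python
from collections import Counter

# Hypothetical hand-tagged tokens standing in for POS-tagger output.
# The tag names (DET, NOUN, VERB, ADJ, OF, PREP, PUNCT) are illustrative only.
TAGGED = [
    ("the", "DET"), ("court", "NOUN"), ("of", "OF"), ("appeal", "NOUN"),
    ("applied", "VERB"), ("common", "ADJ"), ("law", "NOUN"),
    ("in", "PREP"), ("a", "DET"), ("precedent", "NOUN"), ("case", "NOUN"),
    (".", "PUNCT"),
    ("judges", "NOUN"), ("follow", "VERB"), ("precedent", "NOUN"),
    (".", "PUNCT"),
]

# One tag sequence per collocation type named on the slide.
PATTERNS = {
    "verb + noun": ("VERB", "NOUN"),
    "noun + noun": ("NOUN", "NOUN"),
    "adjective + noun": ("ADJ", "NOUN"),
    "noun + of + noun": ("NOUN", "OF", "NOUN"),
}


def extract_collocations(tagged, patterns=PATTERNS):
    """Count every contiguous word sequence whose tags match a pattern."""
    found = {name: Counter() for name in patterns}
    for name, pattern in patterns.items():
        size = len(pattern)
        for start in range(len(tagged) - size + 1):
            window = tagged[start:start + size]
            if all(tag == want for (_, tag), want in zip(window, pattern)):
                found[name][" ".join(word for word, _ in window)] += 1
    return found


if __name__ == "__main__":
    for name, counts in extract_collocations(TAGGED).items():
        print(name, dict(counts))
```

Strict adjacency is a simplification: the slide's own example "abolish judicial review" has an adjective between the verb and the noun, so a fuller matcher would need to allow intervening modifiers.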
Lexical Bundles
“Lexical bundles” are multi-word sequences with distinctive syntactic patterns and discourse functions that are commonly used in academic prose (Biber & Barbieri, 2007; Biber et al, 2003, 2004).
Bundles from OpenYale Environmental Politics and Law:
• noun phrase + of e.g. So the idea of
• prepositional phrase + of e.g. on the part of
• it + verb/adjective phrase e.g. it’s going to be
• be + noun/adjective phrase e.g. is an example of
• verb phrase + that e.g. of the way that
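Because lexical bundles are defined as recurring multi-word sequences, one simple way to find candidates is to count word n-grams and keep those above a frequency threshold. The sketch below assumes a hypothetical function `lexical_bundles`, a four-word window, and a raw-count cutoff chosen only for illustration. It does not reproduce the thresholds used in the cited studies or in FLAX, and it does not classify bundles by syntactic pattern.

```python
import re
from collections import Counter

# Illustrative sample text, not taken from the FLAX collections.
SAMPLE = "On the part of the state, and on the part of the court."


def lexical_bundles(text, n=4, min_count=2):
    """Return the n-word sequences that occur at least min_count times."""
    # Lowercase and keep alphabetic tokens (plus apostrophes, as in "it's").
    tokens = re.findall(r"[a-z']+", text.lower())
    grams = Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    return {gram: count for gram, count in grams.items() if count >= min_count}


if __name__ == "__main__":
    print(lexical_bundles(SAMPLE))
```

On a real corpus the cutoff would normally be a normalised frequency rather than a raw count, and candidates would still need manual or rule-based sorting into categories such as "prepositional phrase + of".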
Law Collections in FLAX
Type of media in the FLAX Law Collections | Number and source of items in the FLAX Law Collections
Podcast audio files & transcripts (OpenSpires - OER) | 10-15 Lectures (Oxford Law Faculty & the Centre for Socio-Legal Studies)
MOOC lecture transcripts & videos (streamed via YouTube & Vimeo - OER) | 4 MOOC Collections: Copyright Law (Harvard/edX), English Common Law (Uni. of London/Coursera), Age of Globalization (Texas at Austin/edX), Environmental Law & Politics (OpenYale)
PhD Law theses (Open Access) | 50 EThoS Theses at the British Library (Abstracts, Introductions, Conclusions)
British Law Report Corpus (BLaRC) by Marin, 2012 (Open Access) | 8-million word corpus derived from freely available legal content on the BAILII website
Research Articles (Open Access) | 40 Articles (DOAJ - Directory of Open Access Journals)
Working with Full Texts
Collocations Within ESAP Collections
Linking to the FLAX Learning Collocations Collection (BNC, BAWE, Wikipedia)
Good Ol’ Part-Of-Speech Tagging
Wikify Your Collections
Lexical Bundles
FLAX HTML Formatting Tool
RESEARCHING RESOURCES AT THE INTERFACE OF OPENNESS FOR ACADEMIC ENGLISH
Key Data Sets Will Consist Of:
• Data for evaluation of collections and classroom teaching derivatives of the collections for ESAP
– Survey and Think-Aloud Protocols to evaluate the FLAX Language System
– Timed-writing exam preparation (Queen Mary University of London).
• Interview and focus-group data (f2f and online via Skype)
– With stakeholders (language teachers, academics, MOOC providers) involved in the development of the academic language collections used in this research.
References
• Biber, D., Conrad, S., & Cortes, V. (2003). Lexical bundles in speech and writing: an initial taxonomy. In A. Wilson, P. Rayson, & T. McEnery (Eds.), Corpus linguistics by the Lune: A festschrift for Geoffrey Leech (pp. 71–92). Frankfurt/Main: Peter Lang.
• Biber, D., Conrad, S., & Cortes, V. (2004). If you look at . . .: lexical bundles in university teaching and textbooks. Applied Linguistics, 25, 371–405.
• Biber, D. (2006). University Language: A corpus-based study of spoken and written registers. Amsterdam: John Benjamins.
• Biber, D., & Barbieri, F. (2007). Lexical bundles in university spoken and written registers. English for Specific Purposes, 26, 263–286.
• Breeze, R. (2012). Rethinking Academic Writing Pedagogy for the European University. Amsterdam: Rodopi.
• Milne, D., & Witten, I.H. (2013). An open-source toolkit for mining Wikipedia. Artificial Intelligence, 194, 222–239.
Thank You
FLAX Language Project http://flax.nzdl.org/greenstone3/flax?a=fp&sa=library
Shaoqun Wu: [email protected] / Ian Witten: [email protected]
OER Research Hub http://oerresearchhub.org/
Alannah Fitzgerald: [email protected]; @AlannahFitz; www.alannahfitzgerald.org TOETOE Blog; Slideshare: http://www.slideshare.net/AlannahOpenEd/
The Language Centre – Queen Mary University of London http://language-centre.sllf.qmul.ac.uk/
Martin Barge [email protected]
William Tweddle [email protected]
Saima Sherazi [email protected]