ENeL WG3 meeting: Automatic Knowledge Acquisition for Lexicography Herstmonceux, August 2015 STARTS AT 2:30 PM

Upload: sharlene-mckenzie

Post on 05-Jan-2016


Agenda

• Welcome
• Appointment of minutes secretary
• Presentations
• Practicalities
• Minutes of the Bolzano (July 2014) meeting
• MWE Survey & joint ENeL/PARSEME workshop
• Following meeting & future meetings
• Training school 2016
• AOB


MWE Survey

• https://docs.google.com/forms/d/1eDVIxClSJO_6NOe3Jmi8cto_g__eaG2G3KPJJXfZszk/viewform
• This is a joint survey of the ENeL and PARSEME COST Actions. The aim is to identify dictionaries and lexical resources which contain Multiword Expressions (MWEs) and how these MWEs are represented in those dictionaries and lexical resources.
• PARSEME members can benefit from additional data sources for their research, and ENeL members can benefit from the expertise to process their resources and carry out new research.
• The results of this survey will help us to prepare a joint workshop of ENeL and PARSEME, which is planned for April 2016.


Results – resources

• Name of the resource containing MWEs:
  • {hr,sl,sr}MWELex
  • Kamusi GOLD
  • TERMIS: a database of Slovene PR terms
  • Estonian Collocation Dictionary
  • Dicionario bilingüe castellano-gallego de la Real Academia Gallega
  • OENOLEX - Professional dictionary of wine tasting
  • Multiword Expressions in Czech
  • The Danish Dictionary (Den Danske Ordbog)
  • Algemeen Nederlands Woordenboek (ANW)
  • Slovene Lexical Database


Results – other questions

• Languages
  • Croatian/Serbian/Slovene, multilingual, Slovene, Estonian, Spanish and Galician, French, Czech, Danish, Dutch (including regional variation: Netherlands, Belgium, Suriname)
• Availability of the resource
  • restricted (6), unrestricted (4)
• Availability of the resource for the workshop
  • yes (8), no (2)
• Participation at the workshop
  • yes (8), no (2)


The workshop: info

• "PARSEME/ENeL workshop on MWE e-lexicons"
• http://typo.uni-konstanz.de/parseme/index.php/2-general/135-enel-parseme-workshop-on-mwe-lexicons
• 5-6 April 2016 (co-located with PARSEME's 6th general meeting)
• University of Skopje, Faculty of Computer Science and Engineering (FCSE), Skopje, FYR Macedonia
• Organizers: Simon Krek & Carole Tiberius (ENeL), Carla Parra Escartín & Manfred Sailer (PARSEME)
• Local Organizer: Katerina Zdravkova
• Participants: 20 experts (10 from ENeL and 10 from PARSEME)


The workshop: Parseme

• PARSEME (PARSing and Multi-word Expressions)
• Towards linguistic precision and computational efficiency in natural language processing
• WG 1: Lexicon-Grammar Interface
• WG 2: Parsing Techniques for MWEs
• WG 3: Statistical, Hybrid and Multilingual Processing of MWEs
• WG 4: Annotating MWEs in Treebanks


The workshop: aims

• MWEs are defined as sequences of words with some unpredictable properties, such as "to count somebody in" or "to take a haircut". In a lexicographic context, they are typically described as idioms, phraseology, phrasal verbs and similar elements, as parts of dictionary entries.
• The aim of the workshop is to combine knowledge about (individual) lexical resources from ENeL members with the NLP expertise present in PARSEME, to better understand the nature of MWEs as defined by the lexicographic and NLP communities, in order to:
  • enhance MWE extraction techniques for lexicographic purposes
  • encourage the use of lexical resources with MWEs in NLP


Following meeting & future meetings

• Following meeting
  • Online Dictionaries and their Users (WG1 & WG3)
  • March 2016 (dates decided by SG)
  • Spain: University of Vigo
  • Organisers: Robert Lew (WG1), Carole Tiberius
  • Local organisers: Carlos Valcárcel Riveiro & María José Domínguez Vázquez
• Future meetings (Brno, Czech Republic, September 12-16, 2016)
  • The use of lexicographical data in computational linguistics: investigation of possible use of dictionary content for computational linguistic applications
  • Between Corpora and Dictionaries: analysis of the interface between dictionaries and computational lexica and (syntactically and semantically annotated) corpora


Training school

• 2015: WG2
  • Standard tools and methods for retro-digitising dictionaries
  • 6-10 July 2015, Lisbon, Portugal
  • 5 trainers
  • 29 trainees
• 2016: WG3
  • Title and detailed programme to be defined
  • Trainers: to be defined together with the programme (before the end of 2015)
  • Dates: 17-20 May 2016 (pre-LREC 2016 in Portorož, Slovenia)
  • Location: Ljubljana, Slovenia (University of Ljubljana)


AOB

• Tomorrow (14:00-15:30): Use cases of all WGs combined
• ?