Improved Parser for Simple Croatian Sentences Kristina Vučković, Božo Bekavac, Zdravko Dovedan University of Zagreb, Faculty of Humanities and Social Sciences {kvuckovi, bbekavac,zdovedan}@ffzg.hr NooJ2010 Komotini, Greece 2010-05-28


Improved Parser for Simple Croatian Sentences

Kristina Vučković, Božo Bekavac, Zdravko Dovedan

University of Zagreb, Faculty of Humanities and Social Sciences

{kvuckovi, bbekavac, zdovedan}@ffzg.hr

NooJ2010 Komotini, Greece

2010-05-28


Last Year's Problems: Solved

if there is a nominal predicate

  ex. The child was very smart.
  -> The child was
  -> was very smart.

if the <NP> node is a coordination made of 2 or more <NP>s of different gender

  ex. The boy and a girl were very smart.
  -> The boy and a girl => <NP+p+m>


Nominal Predicate

simple <NP>: single noun
  He is a boy.

complex <NP>: single noun + any number of adjectives, pronouns, numerals preceding and agreeing with the noun in number, gender and case
  He is my young friend.

single pronoun
  He is mine.

single adjective
  He is young.

single numeral
  He is the first.

adverb
  He is great.

redefined <NP>:
  <(PRO|A|M)*N>
  <(PRO|A|M)*PRO>
  <(PRO|A|M)*A>
  <(PRO|A)*M>
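The four redefined <NP> patterns can be read as "zero or more modifiers, then a head". The following is a minimal Python sketch of that reading, not the authors' NooJ grammar. It assumes the input is a list of part-of-speech tags (PRO, A, M, N, as on the slide), and the function name `is_redefined_np` is hypothetical. Agreement in number, gender and case is not checked here.

```python
# Each pattern is (allowed modifier tags, head tag), mirroring
# <(PRO|A|M)*N>, <(PRO|A|M)*PRO>, <(PRO|A|M)*A>, <(PRO|A)*M>.
NP_PATTERNS = [
    ({"PRO", "A", "M"}, "N"),
    ({"PRO", "A", "M"}, "PRO"),
    ({"PRO", "A", "M"}, "A"),
    ({"PRO", "A"}, "M"),
]


def is_redefined_np(tags):
    """Return True if the tag sequence matches one of the four patterns.

    Only the sequence of tags is checked; morphological agreement
    between modifiers and head is outside the scope of this sketch.
    """
    if not tags:
        return False
    *modifiers, head = tags
    return any(
        head == pattern_head and all(tag in allowed for tag in modifiers)
        for allowed, pattern_head in NP_PATTERNS
    )


# "my young friend" -> pronoun, adjective, noun
print(is_redefined_np(["PRO", "A", "N"]))
# a noun followed by an adjective does not fit any of the patterns
print(is_redefined_np(["N", "A"]))
```

Encoding each pattern as a modifier set plus a head tag keeps the sketch close to the slide's notation while avoiding string-level regular expressions over concatenated tags.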


Examples

Pravda i opraštanje su dva temeljna stupa mira.
  Justice and forgiveness are two main pillars of peace.

Cijene su izuzetno povoljne.
  Prices are extremely affordable.

To je izvrsno.
  This is great.


Measures and Explanations


<to be,VP> <NP+Nom>=Adjective: same form as <VP+Passive>
  Knjiga je napisana… The book is written…
  Napisana knjiga je… The written book is…

Precision: 72.72

Recall: 69.56
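The slides report precision and recall without saying how they were computed. Assuming the standard definitions (precision = correct matches / all matches returned, recall = correct matches / all matches that should have been found), and assuming the figures are percentages, the calculation could be sketched as follows. The function name and the counts in the usage line are illustrative only and do not come from the slides.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) as percentages.

    Assumes the standard definitions; an empty denominator yields 0.0
    rather than raising an error.
    """
    retrieved = true_positives + false_positives
    relevant = true_positives + false_negatives
    precision = 100.0 * true_positives / retrieved if retrieved else 0.0
    recall = 100.0 * true_positives / relevant if relevant else 0.0
    return precision, recall


# Illustrative counts only (not taken from the evaluation on the slides).
p, r = precision_recall(true_positives=8, false_positives=2, false_negatives=2)
print(f"Precision: {p:.2f}  Recall: {r:.2f}")
```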


Coordination of multiple <NP> nodes

1. all <NP> are of the same gender
   <NP+f <NP+f jabuka>, <NP+f kruška> i <NP+f šljiva>>
   an apple, a pear and a plum

2. combination of masculine and feminine <NP>
   <NP+m <NP+f jabuka>, <NP+f kruška> i <NP+m ananas>>
   an apple, a pear and a pineapple

3. combination of masculine and neuter <NP>
   <NP+m <NP+m ananas> i <NP+n dijete>>
   a pineapple and a child

4. combination of neuter and feminine <NP>
   <NP+m <NP+f jabuka> i <NP+n dijete>>
   an apple and a child
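The four cases above reduce to one rule: a coordination keeps the shared gender when all its <NP> parts agree, and is marked masculine otherwise. The following is a minimal Python sketch of that rule, not the NooJ grammar itself. The function name `resolve_gender` is hypothetical, and treating a mix of all three genders as masculine is an assumption, since the slide does not list that case.

```python
VALID_GENDERS = {"m", "f", "n"}


def resolve_gender(genders):
    """Gender tag of a coordinated <NP>, given the genders of its parts.

    Same gender throughout -> that gender (case 1).
    Any mix of two genders -> masculine (cases 2-4).
    A mix of all three genders is not covered by the slide; masculine
    is assumed here.
    """
    unique = set(genders)
    if not unique:
        raise ValueError("a coordination needs at least one <NP>")
    if not unique <= VALID_GENDERS:
        raise ValueError(f"unknown gender tag(s): {sorted(unique - VALID_GENDERS)}")
    if len(unique) == 1:
        return next(iter(unique))
    return "m"


# jabuka (f), kruška (f), ananas (m) -> outer node is tagged <NP+m>
print(f"<NP+{resolve_gender(['f', 'f', 'm'])}>")
```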


Coordination of two or more proper nouns

<NP+m Zagreb> i <NP+m Dubrovnik> => <NP+m>
<NP+m Zagreb> i <NP+f Barcelona> => <NP+m>
<NP+m Tin Ujdur> i <NP+m Filip Kocijan> => <NP+m>
<NP+f Ema Ujdur> i <NP+m Filip Kocijan> => <NP+m>

Last names only

<NP+m Ujdur> i <NP+m Kocijan> => <NP+m> su otišli
<NP+f Ujdur> i <NP+f Kocijan> => <NP+f> su otišle
<NP+f Ujdur> i <NP+m Kocijan> => <NP+m> su otišli


Example of concordance


Measures


Precision: 100

Recall: 90


Thank you for your attention.

The research within the project ACCURAT leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013), grant agreement no. 248347.

www.accurat-project.eu