grammar-based treebank a happy marriage of empiricism and
TRANSCRIPT
![Page 1: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/1.jpg)
Grammar-based treebank– a happy marriage of empiricism and theory?
Alexandr Rosen
Institute of Theoretical and Computational LinguisticsFaculty of Arts, Charles University, Prague
Grammar and Corpora 20124th International Conference
Czech Academy of Sciences, Prague28–30 November 2012
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 1 / 84
![Page 2: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/2.jpg)
The bottom line (or two)
A corpus is an approximation of language use,a grammar is an approximation of language system.
→
The empirical and the theoretical sides of linguisticsmeet in the annotation of a corpus.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 2 / 84
![Page 3: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/3.jpg)
Outline of the talk
1 Why treebanks, why grammars?
2 Treebanks
3 Grammars
4 The grammar–treebank relationship
5 Czech treebanking
6 Architecture
7 Examples
8 Input processing
9 Conclusions and plans
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 3 / 84
![Page 4: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/4.jpg)
Why treebanks, why grammars?
Outline of the talk
1 Why treebanks, why grammars?
2 Treebanks
3 Grammars
4 The grammar–treebank relationship
5 Czech treebanking
6 Architecture
7 Examples
8 Input processing
9 Conclusions and plans
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 4 / 84
![Page 5: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/5.jpg)
Why treebanks, why grammars?
Why treebanks?
Treebank . . . a text corpus annotated (at least) with syntactic structure
= why corpora?
= why annotation?
= why syntax?
?= why grammars?
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 5 / 84
![Page 6: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/6.jpg)
Why treebanks, why grammars?
Why treebanks? (cont’d)
Explicit markup of syntactic relations (constituents, heads/dependents)
→
Easier to identify semantic relations (predicates and arguments)
Simplifies some queries
Simplifies extraction of lexical properties (valency)
Support for grammar development
Training data for NLP applications
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 6 / 84
![Page 7: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/7.jpg)
Why treebanks, why grammars?
Why grammars? 1/2
“Every time I fire a linguist, systemperformance goes up.”
Fred Jelinek, 1980s
But maybe we don’t care about systemperformance?
Moreover:No longer a wise strategy for NLPEmpirical and symbolic methods canbe combined‘Deep’ linguistics needed for long-termsuccess
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 7 / 84
![Page 8: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/8.jpg)
Why treebanks, why grammars?
Why grammars? 2/2
“We should probably all spendmore time on the linguisticannotation of actual data ratherthan on writing grammar rules,based primarily on introspection.”
Erhard Hinrichs, 1990s
But what kind of annotation?“A sentence has as many structures as there are theories.”[Haider(1993)]
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 8 / 84
![Page 9: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/9.jpg)
Treebanks
Outline of the talk
1 Why treebanks, why grammars?
2 Treebanks
3 Grammars
4 The grammar–treebank relationship
5 Czech treebanking
6 Architecture
7 Examples
8 Input processing
9 Conclusions and plans
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 9 / 84
![Page 10: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/10.jpg)
Treebanks
Treebanks
First treebank: Lancaster-Leeds Treebankearly 1980s, 45 KW, later SUSANNE, due to Geoffrey Sampson
First major project: Penn Treebankrelease 0.5 in 1992, now 3 MW
Now according to Wiki: 74 treebanks in about 40 languages
The 11th International Workshop on Treebanks and LinguisticTheories starts today, approx. 20 contributions each year
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 10 / 84
![Page 11: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/11.jpg)
Treebanks
Treebanks differ in:
Size
Linguistic background
Format
Level of detail
Depth of analysis
Ways they are built
Also spoken, parallel, historical, ... treebanks
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 11 / 84
![Page 12: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/12.jpg)
Treebanks
Treebanks around the world *)
63 treebanks, 36 languages, sizes up to 1.5 billion words
Also spoken (8), historical (7), parallel (4)
Mostly stochastically parsed and manually corrected
15 parsed by a symbolic grammar (LFG, HPSG, DCG) andmanually disambiguated
39 PS-based annotation, 20 dependency-based annotation
15 available with multiple annotation formats – Penn Treebank :PS, P/A, dependency, LFG, HPSG, CCG, LTAG, PDT
20 with on-line search interface
*) The speaker’s time permitting!
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 12 / 84
![Page 13: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/13.jpg)
Treebanks
More examples of treebanks
Prague Dependency Treebank – Czech: 1.5 MWTiger – German: 0.9 MWLASSY – Dutch: 1500 MWLingo Redwoods – English: 45 KSBulTreeBank – Bulgarian: 250 KWINESS Treebanking Infrastructure – various: [Rosén et al.(2012)]Składnica – Polish Constituency Treebank : 8 KS...
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 13 / 84
![Page 14: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/14.jpg)
PDT – analytical layer
![Page 15: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/15.jpg)
PDT – tectogrammatical layer
![Page 16: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/16.jpg)
Tiger
![Page 17: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/17.jpg)
Old Church Slavonic (INESS)
![Page 18: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/18.jpg)
Polish (Składnica)
![Page 19: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/19.jpg)
Grammars
Outline of the talk
1 Why treebanks, why grammars?
2 Treebanks
3 Grammars
4 The grammar–treebank relationship
5 Czech treebanking
6 Architecture
7 Examples
8 Input processing
9 Conclusions and plans
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 19 / 84
![Page 20: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/20.jpg)
Grammars
About grammars
Treebank grammars [Charniak & Charniak(1996)]
Probabilistic grammars directly projected from treebanks
“a paradigm shift from the manually constructed, a priori fixedlinguistic grammars” [Prescher et al.(2006)]
Annotation manuals
Symbolic (rule-based) grammars
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 20 / 84
![Page 21: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/21.jpg)
Grammars
The paradigm shift
Analytical, linguistic×empirical, data-driven
Analytical = analysis of linguistic competence
Poor coverage→ discontinue ‘deep’ processing?
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 21 / 84
![Page 22: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/22.jpg)
Grammars
Anyone need grammars? (Stephan Oepen, TLT2) 1/2
The Ultimate GrammarCoverage of arbitrary data, cross-domain and cross-genreAdequate grammatical analyses in all casesInclusion of semanticsFully declarativeSame grammar for both parsing and generationHigh-efficiency processing tools
BUT:No generally accepted linguistic theoryLong, tedious, error-prone engineering processFew experts
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 22 / 84
![Page 23: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/23.jpg)
Grammars
Anyone need grammars? (Stephan Oepen, TLT2) 2/2
The Final TreebankRepresentative data for ‘all’ of the language, domains, and genresFull annotation with (at least) syntactic and semantic informationUtterly coherentFree of errorsFully documentedFreely available
BUT:No generally accepted annotation standardLong, tedious, error-prone annotation process
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 23 / 84
![Page 24: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/24.jpg)
Grammars
The answer:grammars and treebanks should go together
Treebank annotation is where a grammar and a treebank canmeet
Treebank annotation is also where multiple theories can meet andcomplement each other
Grammar and treebank are like two sides of a coin:competence × performancesystem × uselangue × paroletheoretical × empirical
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 24 / 84
![Page 25: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/25.jpg)
The grammar–treebank relationship
Outline of the talk
1 Why treebanks, why grammars?
2 Treebanks
3 Grammars
4 The grammar–treebank relationship
5 Czech treebanking
6 Architecture
7 Examples
8 Input processing
9 Conclusions and plans
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 25 / 84
![Page 26: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/26.jpg)
The grammar–treebank relationship
Treebank – grammar/theory relations
A treebank is useful ...As a source and testbed for grammar/theory development[Hajicová & Sgall(2006)]As training data for treebank grammars and other NLP tools
A grammar/theory is useful ...
To guide the design of an annotation schemeTo control annotation consistencyTo generate treebank annotations
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 26 / 84
![Page 27: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/27.jpg)
The grammar–treebank relationship
Linking lexicon and treebank
Theoretically motivated designStart: independently compiled list of entriesIncremental development
Examples:
PDT-VALLEX [Hajic et al.(2003)]FrameNet [Palmer et al.(2005)]PropBank [Baker et al.(1998)]TüBa-D/Z Valency Lexicon [Hinrichs & Telljohann(2009)]...
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 27 / 84
![Page 28: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/28.jpg)
The grammar–treebank relationship
Linking grammar and treebank
Grammar development should be supported by an annotatedcorpusAutomatic annotation by symbolic grammars requires a fullyadequate grammar, ideally based on a corpusVicious circle? A possible answer: Incremental development ofboth the grammar and the treebank
Examples:
LinGO Redwoods [Oepen et al.(2002)]Norgram [Rosén et al.(2006)]BulTreeBank [Simov et al.(2002)]Składnica [Swidzinski & Wolinski(2010)]...
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 28 / 84
![Page 29: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/29.jpg)
The grammar–treebank relationship
Rarely a single correct parse of a sentence
Symbolic grammars have limited access to context and worldknowledge
They produce many parses due to morphosyntactic and structuralambiguities
SolutionsStochastic disambiguation
Stochastic ranking
Manual selection, preferably interactive, based on discriminants
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 29 / 84
![Page 30: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/30.jpg)
The grammar–treebank relationship
Never 100% coverage
A parsed corpus generated by a symbolic grammar will neverreach 100% coverage of real-world data (LinGO: about 80%)
Reasons are fundamental: competence × performance
Some examples:anacolutcontaminationattractionzeugmasome cases of extraction
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 30 / 84
![Page 31: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/31.jpg)
The grammar–treebank relationship
Examples of suboptimal syntax
(1) Kdowho
prijdecomes
pozde,late
nicnothing
muhim
nedají.not-give
Who comes late won’t get anything. (intended)
(2) Vcerayesterday
jsemAUX
videlsaw
aand
mluvilspoke
swith
tímthat
clovekem.man
I saw and spoke to that person yesterday.
(3) Neboor
jáI
GazdaGazda
nevím,not-know
jakhow
diktuje.dictates
Or I don’t know how Gazda dictates. (int’d, due to Jan Klaška)
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 31 / 84
![Page 32: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/32.jpg)
The grammar–treebank relationship
Beyond grammar
How to find negative evidence in standard corpora?
Except for non-words not easy in a corpus of written language
Much of ‘suboptimal’ language use in spoken and learner corpora
Grammar useful to detect ungrammaticality
A treebank of suboptimal German [Kepser et al.(2004)]
Phenomena-oriented corpus [Oliva(2008)]
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 32 / 84
![Page 33: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/33.jpg)
The grammar–treebank relationship
Can we build a grammar-based treebank that includes real language?
Possible solutions?
A combination of stochastic + symbolic methods
Two grammars: positive and negative [Oliva & Petkevic(1998)]
Competence + performance grammar[Kempen & Harbusch(2001)]
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 33 / 84
![Page 34: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/34.jpg)
Czech treebanking
Outline of the talk
1 Why treebanks, why grammars?
2 Treebanks
3 Grammars
4 The grammar–treebank relationship
5 Czech treebanking
6 Architecture
7 Examples
8 Input processing
9 Conclusions and plans
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 34 / 84
![Page 35: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/35.jpg)
Czech treebanking
The treebank of Czech
Prague Dependency Treebank
Dependency syntax, close to the Prague theory of FunctionalGenerative Description [Sgall et al.(1986)]
3 annotation levels: morphology, surface syntax, deep syntax
PDT 0.5 – 1998, 0.5 MW
PDT 1 – 2000, 1.5 MW
PDT 2 – 2004, deep syntax
PDT 2.5 – 2011, multi-word units, clause segmentation
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 35 / 84
![Page 36: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/36.jpg)
![Page 37: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/37.jpg)
Czech treebanking Time to scale up?
Time to scale up?
1.5 MW still too few for investigating less frequent forms andphenomena
Could offer more annotation formats
Could support inherent syntactic ambiguities
(4) Prineslbrought
bednubox
zefrom
sklepa.cellar
He brought a box from the cellar
(5) krajícslice
chlebabread
swith
máslembutter
a buttered slice of bread
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 37 / 84
![Page 38: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/38.jpg)
Czech treebanking Time to scale up?
A treebank for every taste
Theory-Supporting Treebank [Nivre(2003)]
Theory-neutral annotation contains too little information or toomany compromises to be really useful
Theory-specific may shut out people from other researchtraditions
Conversion? But the source annotation often lacks information tosupport a completely accurate conversion.
Possible conversions as a requirement in the design of treebankannotation schemes. Different kinds of (theory-specific)annotation should be supported by an underlying internalrepresentation.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 38 / 84
![Page 39: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/39.jpg)
Czech treebanking Time to scale up?
A treebank for every taste
Multi-Representational Treebank [Xia et al.(2009)]
Definitional differences between phrase structure anddependency structure: convertible if designed properly.
Preferential differences – the same in both: empty categories;labels to edges; ordered or unordered trees.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 39 / 84
![Page 40: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/40.jpg)
Czech treebanking Time to scale up?
Can a single core annotationbe viewed in different ways?
Theory-specific representations have different appearancesbut share a large part of content:constituency/dependency, morphosyntactic categories,even the spirit of analyses of many phenomena
A treebank offering different views of a sufficiently expressiveannotation scheme is a realistic goal
Additional benefit: relating linguistic theories
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 40 / 84
![Page 41: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/41.jpg)
Czech treebanking Time to scale up?
A larger treebank with customizable visualization?
Short-term goals:
Syntactic annotation of the Czech National Corpus(1.3 billion words) using a stochastic parser,followed by a rule-based correction module
Robust and expressive core annotation format, potentiallyunderspecified
Customizable query, visualization and export interface,offering multiple options to view syntactic structure
Accessible to lay users and satisfying experts at the same time
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 41 / 84
![Page 42: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/42.jpg)
Czech treebanking Time to scale up?
Long-term goals:
Development of a corpus-based grammar
Options for queries, visualization and export:ready-made, tailored to specific theories, ordefined by the user
Development of the correction module
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 42 / 84
![Page 43: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/43.jpg)
Czech treebanking Time to scale up?
The tasks of the grammar
Checking consistency
Adding more information on top of existing annotation
Assisting the treebank user
To help converting the data onto other formats more easily
To help distinguishing grammatical and suboptimal/ungrammaticalforms and structures
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 43 / 84
![Page 44: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/44.jpg)
Czech treebanking Time to scale up?
Grammar design and development
Constraint-based: all is possible except when stipulated otherwise
Hand-crafted but verified against the corpus data
Incremental development, based on conversion rules
Underspecification, partial parses to cope withsuboptimal/ungrammatical forms and constructions
Performance grammar as a mediator with the real-world language,similar to negative grammar?
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 44 / 84
![Page 45: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/45.jpg)
Architecture
Outline of the talk
1 Why treebanks, why grammars?
2 Treebanks
3 Grammars
4 The grammar–treebank relationship
5 Czech treebanking
6 Architecture
7 Examples
8 Input processing
9 Conclusions and plans
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 45 / 84
![Page 46: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/46.jpg)
Architecture Syntactic structure
Syntactic structureInternal skeleton structures: constituency-based, with acombination of binary and flat branching
Interpretable as constituency or dependency trees, according tousers’ specification, visualized with an arbitrary amount of detail,not necessarily by tree graphs
Surface and deep structure encoded within a single structure:constituents are labelled as syntactic functionsincluding head as a special function
Heads are further specified as deep or surfaceDeep head: deep syntactic governor: bylo by se to povedloSurface head: can be identical to the deep head or different:auxiliary, prepositions, subordinate conjunctions, numerals
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 46 / 84
![Page 47: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/47.jpg)
PRED
šel
AUXVbyl
AUXVby
AUXPdo
ADV
lesaS
SHD
byDHD
SHD
bylDHD
HD
šelADVB
SHD
doDHD
lesa
![Page 48: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/48.jpg)
Architecture Syntactic structure
Three levelsWord order and syntactic structure as distinct dimensions, eachsentence is represented at three inter-linked levels:
graphemics (orthographic words, contractions)
morphology (syntactic words, including haplologized items)
syntax (trees, no nodes for pro-dropped subjects)
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 48 / 84
![Page 49: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/49.jpg)
Architecture Syntactic structure
Annotation of syntactic phenomenaAgreement of various types
Compound periphrastic verbal forms(passives, conditional structures, future...)
Grammatical co-reference(grammatical control, relative/reflexive pronouns, predicativecomplements)
Multi-word units (collocations)
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 49 / 84
![Page 50: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/50.jpg)
Architecture Syntactic structure
Expressive powerExpressive enough to accommodate analyses of arbitrarygranularityAmbiguous or undecidable phenomena represented byunderspecification and distributive disjunctionAnnotation of any kind can be missing, a sentence may be a merelist of words
SpecificationsAnnotation must be licensed by a formal grammar. Words andconstituents have their appropriate (potentially underspecified)sets of featuresLexicons are used to index forms, syntactic words and compoundformsCustomizable visualizations are enabled by formal definitions
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 50 / 84
![Page 51: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/51.jpg)
Architecture Syntactic structure
Links within a treeAgreementCompound (multi-word) verbal predicatesGrammatical coreference...
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 51 / 84
![Page 52: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/52.jpg)
Architecture Construction types and syntactic functions
Syntactic structureeach nonterminal node is assigned a construction type and asyntactic functioneach terminal node is assigned a syntactic function
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 52 / 84
![Page 53: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/53.jpg)
Architecture Construction types
Hierarchy of construction typesHeadedUnHeaded
Coord – coordinationAdord – adordinationUnspec – unspecified (for collocations and other)
Function for UnHeaded structures:Memb – a member
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 53 / 84
![Page 54: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/54.jpg)
Architecture Syntactic functions
Syntactic functions for HeadedSurfHead – surface head: auxiliary být/bývat, prepositions,subordinate conjunctions, numerals in quantified expressions: petdetíDeepHead – in case it differs from SurfHead (head nouns in PPs,autosemantic verbs in analytical predicates...)Head – both SurfHead and DeepHead
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 54 / 84
![Page 55: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/55.jpg)
Architecture Syntactic functions
Other syntactic functions for HeadedSubj – subjectAttr – attributeObj-Advb
ObjAdvb
VbAttr – predicative complementReflTant – reflexive element (si, se) for inherent reflexivesDeagent – deagentive reflexiveApos – appositionInDep – independent syntactic element (parenthesis, vocativesyntactic noun...)
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 55 / 84
![Page 56: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/56.jpg)
Examples
Outline of the talk
1 Why treebanks, why grammars?
2 Treebanks
3 Grammars
4 The grammar–treebank relationship
5 Czech treebanking
6 Architecture
7 Examples
8 Input processing
9 Conclusions and plans
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 56 / 84
![Page 57: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/57.jpg)
Examples Contractions
Treating contractions
(6) Ty by ses byl ušpinil.you would REFL+AUX2nd ,sg bepple get dirtypple‘You would have got dirty.’
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 57 / 84
![Page 58: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/58.jpg)
Examples Contractions
Ty by ses byl ušpinil.
(7)
S
SUBJty
HEAD
SURFHEAD
by s
DEEPHEAD
SURFHEAD
bylDEEPHEAD
HEAD
ušpinilOBJse
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 58 / 84
![Page 59: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/59.jpg)
Examples Contractions
(8) Surface dependency structure derived from (7)
by+s
SUBJty
byl
ušpinil
OBJse
(9) Deep dependency structure derived from (7)
bys,byl,ušpinil
SUBJty
OBJse
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 59 / 84
![Page 60: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/60.jpg)
Examples Subject/object ambiguity
Subject/object ambiguity
Reflexive passive:
(10) ZarízeníNom/Gen se využívá.device REFL uses‘The device is being used.’
S
SUBJ
zarízeníHEAD
DEAGENT
seHEAD
využívá
S
HEAD
využíváDEAGENT
seOBJ
zarízení
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 60 / 84
![Page 61: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/61.jpg)
Examples Another type of subject/object ambiguity
Another type of subject/object ambiguity
(11) Zdravotnictví musí zachránit stát.health servicenom/acc must save statenom/acc
Two different readings:
#1 Health service must save the State.#2 Health service must be saved by the government.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 61 / 84
![Page 62: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/62.jpg)
Examples Another type of subject/object ambiguity
S
SUBJ
zdravotnictvíHEAD
HEAD
musíOBJ
HEAD
zachránitOBJ
státS
SUBJ
státHEAD
HEAD
musíOBJ
HEAD
zachránitOBJ
zdravotnictví
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 62 / 84
![Page 63: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/63.jpg)
Input processing
Outline of the talk
1 Why treebanks, why grammars?
2 Treebanks
3 Grammars
4 The grammar–treebank relationship
5 Czech treebanking
6 Architecture
7 Examples
8 Input processing
9 Conclusions and plans
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 63 / 84
![Page 64: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/64.jpg)
Input processing
Processing of the input text:
Automatic correction of the output of a stochastic parser
Conversion of the corrected parse + modifications:phenomena that require arbitrary decisions in a dependency tree:constructions with function words, coordinated constructions, listsdisjunction accounting for structural ambiguities expressed byPDT’s “combined functions” AttrAdv, ObjAdv
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 64 / 84
![Page 65: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/65.jpg)
Input processing
Syntactic tree in the PDT and the new format
(12) Most,Bridge
kterýwhich
bylwas
vin
havarijnímemergency
stavu,state
byshould
melhavemodal
sloužitserve
dalšíchnext
tricetthirty
let.years.
‘The bridge, which was ramshackle, should serve for anotherthirty years.’
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 65 / 84
![Page 66: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/66.jpg)
![Page 67: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/67.jpg)
![Page 68: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/68.jpg)
![Page 69: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/69.jpg)
Input processing Correction module
Correction module
30 correction rules so far
For more frequent errors which can be reliably corrected
Such as noun in accusative as subject
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 69 / 84
![Page 70: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/70.jpg)
Input processing Correction module
Success rate of the correction modules
Rules Dependency Label TotalClauses 6 1688 774 1744NP 8 819 2066 2625PP 9 834 7160 7722Other 5 412 1390 1802Total (ppm) 3753 11390 13893Total (%) 0.38% 1.14% 1.39%
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 70 / 84
![Page 71: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/71.jpg)
Conclusions and plans
Outline of the talk
1 Why treebanks, why grammars?
2 Treebanks
3 Grammars
4 The grammar–treebank relationship
5 Czech treebanking
6 Architecture
7 Examples
8 Input processing
9 Conclusions and plans
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 71 / 84
![Page 72: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/72.jpg)
Conclusions and plans
Conclusions and plans 1/2
ResultsConversion rulesCorrection module200M corpus parsed and correctedBeta version of a viewer with three representation modes
Further workManually tagged and parsed subcorpus will provide better data totrain the parserMore parsing errors will be detected and correctedMore modes of viewing the syntactic structureGrammar development
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 72 / 84
![Page 73: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/73.jpg)
Conclusions and plans
Conclusions and plans 2/2
Empiricism and theory meet in the corpus annotation
Competence grammar to fully license the annotation ofgrammatical forms and constructions
Underspecification and partial parses for the rest
Performance grammar to close the gap between the real languageand the annotation provided by the competence grammar
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 73 / 84
![Page 74: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/74.jpg)
Conclusions and plans
Based on the work of:
Milena Hnátková, Petr Jäger,Tomáš Jelínek, Vladimír Petkevic,
Hana Skoumalová and myself
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 74 / 84
![Page 75: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/75.jpg)
Conclusions and plans
Supported by:
The Grant Agency of the Czech Republic
Project no. GACR P406/10/0434
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 75 / 84
![Page 76: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/76.jpg)
Conclusions and plans
S
VP
V
Thank
NP
you
PP
P
for
NP
Det
your
N
attention!Dekuji
vám za
pozornost!
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 76 / 84
![Page 77: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/77.jpg)
Conclusions and plans
References I
Baker, C. F., Fillmore, C. J., & Lowe, J. B. (1998).The Berkeley FrameNet project.In 36th Meeting of the Association for Computational Linguisticsand 17th International Conference on Computational Linguistics(COLING-ACL’98), pages 86–90, Montréal.
Charniak, E. & Charniak, E. (1996).Tree-bank grammars.In In Proceedings of the Thirteenth National Conference onArtificial Intelligence, pages 1031–1036.
Haider, H. (1993).Deutsche Syntax – Generativ.Narr, Tübingen.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 77 / 84
![Page 78: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/78.jpg)
Conclusions and plans
References II
Hajic, J., Panevová, J., Urešová, Z., Bémová, A., & Pajas, P.(2003).PDT-VALLEX: Creating a large-coverage valency lexicon fortreebank annotation.In Proceedings of The Second Workshop on Treebanks andLinguistic Theories, pages 57–68. Växjö University Press.
Hajicová, E. & Sgall, P. (2006).Corpus annotation as a test of a linguistic theory.In Proceedings of LREC 2006, pages 879–884.
Hinrichs, E. W. & Telljohann, H. (2009).Constructing a valence lexicon for a treebank of German.In Proceedings of the Seventh International Workshop onTreebanks and Linguistic Theories, page 41–52.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 78 / 84
![Page 79: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/79.jpg)
Conclusions and plans
References III
Kempen, G. & Harbusch, K. (2001).Performance grammar: a declarative definition.In M. Theune, A. Nijholt, and H. Hondorp, editors, CLIN, volume 45of Language and Computers – Studies in Practical Linguistics,pages 148–162. Rodopi.
Kepser, S., Steiner, I., & Sternefeld, W. (2004).Annotating and querying a treebank of suboptimal structures.In In Proceedings of the 3rd Workshop on Treebanks andLinguistic Theories (TLT2004), pages 63–74.
Nivre, J. (2003).Theory-supporting treebanks.In Proceedings of the Second Workshop on Treebanks andLinguistic Theories.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 79 / 84
![Page 80: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/80.jpg)
Conclusions and plans
References IV
Oepen, S., Flickinger, D., Toutanova, K., & Manning, C. D. (2002).LinGO Redwoods: A rich and dynamic treebank for HPSG.In Proceedings of the Workshop on Treebanks and LinguisticTheories, September 20-21 (TLT02), Sozopol, Bulgaria.
Oliva, K. (2008).Phenomena-oriented corpora: a manifesto.In F. Štícha and M. Fried, editors, Grammar & Corpora =Gramatika a korpus 2007. Sborník príspevku ze stejnojmennékonference 25.-27. 9. 2007, Liblice= Selected contributions fromthe conference Grammar and Corpora, Sept. 25-27, 2007, Liblice,pages 77–104, Praha. Academia.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 80 / 84
![Page 81: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/81.jpg)
Conclusions and plans
References V
Oliva, K. & Petkevic, V. (1998).Phenomena-based description of dependency-syntax: A survey ofideas and formalization.In E. Hajicová and B. Hladká, editors, Issues of Valency andMeaning – Studies in Honour of Jarmila Panevová. CharlesUniversity Press, Praha.
Palmer, M., Gildea, D., & Kingsbury, P. (2005).The proposition bank: An annotated corpus of semantic roles.Computational Linguistics, 31(1), 71–106.
Prescher, D., Scha, R., Sima’an, K., & Zollmann, A. (2006).What are treebank grammars?In BNAIC’06: BeNeLux conference on Artificial Intelligence 2006,Namur, Belgium.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 81 / 84
![Page 82: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/82.jpg)
Conclusions and plans
References VI
Rosén, V., de Smedt, K., & Meurer, P. (2006).Towards a toolkit linking treebanking to grammar development.In Proceedings of the 5th International Workshop on Treebanksand Linguistic Theories (TLT’05), Prague, Czech Republic.
Rosén, V., Smedt, K. D., Meurer, P., & Dyvik, H. (2012).An open infrastructure for advanced treebanking.In J. Hajic, K. D. Smedt, M. Tadic, and A. Branco, editors,Proceedings of the META-RESEARCH Workshop on AdvancedTreebanking, LREC 2012, pages 22–29, Istanbul, Turkey. ELRA,European Language Resources Association.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 82 / 84
![Page 83: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/83.jpg)
Conclusions and plans
References VII
Sgall, P., Hajicová, E., & Panevová, J. (1986).The Meaning of the Sentence in its Semantic and PragmaticAspects.Reidel and Academia, Dordrecht and Praha.Editor: Jacob Mey.
Simov, K., Osenova, P., Kolkovska, S., Balabanova, E., Doikoff, D.,Ivanova, K., & Alexander Simov, M. K. (2002).Building a linguistically interpreted corpus of Bulgarian: theBulTreeBank.In Proceedings of LREC 2002, pages 1729–1736, Canary Islands,Spain.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 83 / 84
![Page 84: Grammar-based treebank a happy marriage of empiricism and](https://reader031.vdocument.in/reader031/viewer/2022020916/61b4ff455038f146d018a275/html5/thumbnails/84.jpg)
Conclusions and plans
References VIII
Xia, F., Rambow, O., Bhatt, R., Palmer, M., & Sharma, D. M.(2009).Towards a multi-representational treebank.In F. Van Eynde, A. Frank, G. van Noord, and K. De Smedt, editors,Proceedings of the 7th International Workshop on Treebanks andLinguistic Theories (TLT7), pages 127–133, Utrecht. LOT.
Swidzinski, M. & Wolinski, M. (2010).Towards a bank of constituent parse trees for Polish.In Proceedings of the 13th International Conference on Text,Speech and Dialogue, TSD’10, pages 197–204, Berlin,Heidelberg. Springer-Verlag.
A. Rosen (CU Prague) Grammar-based treebank: empiricism/theory Grammar & Corpora 2012 84 / 84