TOWARDS THE CREATION OF A BELARUSIAN GRAMMATICAL DICTIONARY
Igor V. Shevchenko (ULIF NANU, Kyiv), Natalia Kotsyba (ISPAN, Warsaw), Kiryl Kurshuk (Hrodna University)
Grammatical dictionaries: applications
• provide a description of word-inflexion and word-formation
• enable lemmatization
• morphological analysis and synthesis
• grammatical tagging of text corpora
• can be integrated into other dictionaries: explanatory, synonym, etc. (“Словники України”, “Dictionaries of Ukraine”)
Grammatical dictionaries: history
• prototype: Grammatical Dictionary of Russian by Andrey Zaliznyak (1967)
• Grammatical dictionary of the Ukrainian language (UGD) by Igor Shevchenko, 2456 word-inflexion grammatical classes (WIC)
• Grammatical dictionaries for Polish, German, English, etc., in ULIF NASU
Grammatical dictionaries: features
• contain information about all the forms of inflected words of a language and their grammatical features
• uniformity of information is provided by word-inflexion classes (WICs)
• a WIC is a set of words with the same type of word-inflexion
• a WIC is determined by a combination of parameter values
• words belonging to the same WIC differ only in their invariable parts
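The definition of a WIC as "a set of words with the same combination of parameter values" can be illustrated with a minimal Python sketch. The parameter values are copied from the WIC #1540 and #382 example slides of this presentation; the data layout and the function name `group_into_classes` are hypothetical and are not the authors' implementation. Stress marks are omitted.

```python
from collections import defaultdict

# Lexeme -> tuple of parameter values (hypothetical layout; the values
# come from the WIC #1540 and WIC #382 example slides).
LEXEMES = {
    "виборець": ("n", "2dec", "soft", "-е", "person", "а"),
    "видавець": ("n", "2dec", "soft", "-е", "person", "а"),
    "іспанець": ("n", "2dec", "soft", "-е", "person", "а"),
    "компенсувати": ("v", "1dec", "iota", "-ати", "imperf + perf", "–", "–"),
    "ліквідувати": ("v", "1dec", "iota", "-ати", "imperf + perf", "–", "–"),
}

def group_into_classes(lexemes):
    """Group lexemes so that each group shares one parameter combination."""
    classes = defaultdict(list)
    for lemma, params in lexemes.items():
        classes[params].append(lemma)
    return dict(classes)

classes = group_into_classes(LEXEMES)
for params, lemmas in classes.items():
    print(params, "->", lemmas)
```

Each key of `classes` plays the role of one WIC; in the real dictionary the classes carry numbers such as 1540 or 382.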
POS-independent value parameters
• part of speech (or WI-generalization, WI-type), e.g. masculine nouns, feminine nouns
• type of word stem
• conjugation pattern
• type of consonant-vowel changes
• paradigm incompleteness
• non-typical features of wordforms in certain grammatical meanings
• type of the accent distribution in the word-inflexion paradigm
POS-dependent value parameters
For verbs:
• aspect, reflexivity, imperative form, passive participle suffix
For nouns:
• gender, animacy, genitive for masculine nouns, locative for masculine and neuter nouns, dative for masculine nouns, accusative case in plural
• etc.
WIC examples: #1540 and #382
Lexeme | POS | declension | basis | change | anim | genitiv | WIC
ви́борець, видаве́ць, іспа́нець, промисло́вець | n | 2dec | soft | -е | person | а | 1540

Lexeme | POS | conjugation | basis | final suffix | aspect | reflex | change | WIC
компенсува́ти, ліквідува́ти, нормалізува́ти | v | 1dec | iota | -ати | imperf + perf | – | – | 382
WIC examples: #2335 and #1145
Lexeme | POS | declension | basis | change | animal coord | pecul | WIC
Віта́ліїв, змі́їв, ну́нціїв | a | possessive | hard | ї-є | + | – | 2335
вся́кий, де́котрий, жо́дний | pron | general | hard | – | + | – | 1145
Word-inflexion parameter: scope
• a word-inflexion parameter can be regarded as a discrete function with a limited range of possible values (the value area)
• the parameter “type of the word stem” can take one of 5 values: hard, soft, combined, iotized and r-type
• the parameter “gender” has potentially 10 different values for a lexeme: three single genders, six ordered combinations of two genders and one combination of all three genders
• the parameter “genitive for masculine nouns” has three values: -а (or -я, depending on the word ending), -у (-ю), or both -а (-я) and -у (-ю)
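The count of 10 gender values can be checked with a short sketch. It assumes that "combinations by the order of two" means ordered pairs of distinct genders, since only that reading gives six; the names `GENDERS` and `gender_values` are illustrative.

```python
from itertools import permutations

GENDERS = ("masculine", "feminine", "neuter")

def gender_values(genders=GENDERS):
    """Enumerate the value area of the 'gender' parameter."""
    singles = [(g,) for g in genders]               # 3 single genders
    ordered_pairs = list(permutations(genders, 2))  # 6 ordered pairs
    all_three = [tuple(genders)]                    # 1 combination of all three
    return singles + ordered_pairs + all_three

values = gender_values()
print(len(values))  # 10
```

The other parameters listed above have much smaller value areas (5 stem types, 3 genitive values), so a WIC can be stored as a short tuple of such discrete values.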
GD for Belarusian
• Dictionary of the Belarusian Language. Spelling. Orthoepy. Accentuation. Word-Inflection. (Слоўнік беларускай мовы. Арфаграфія. Арфаэпія. Акцэнтуацыя. Словазмяненне), Minsk, 1987
• ca. 100,000 words, including material from:
  • “The Belarusian-Russian Dictionary” (Moscow, 1962)
  • Explanatory Dictionary of the Belarusian Language (“Тлумачальны слоўнік беларускай мовы”, vol. 1-5, 1977-1984)
  • materials of the lexical card file of the Yakub Kolas Institute of Linguistics
Examples of word entries
• абвяза́ць зак., абвяжу́, абвя́жаш, -жа, -жам, -жаце, -жуць
• каці́ць незак., качу́, ко́ціш, ко́ціць, ко́цім, ко́ціце, ко́цяць
• вы́ган с.-г., -ну, -не, -наў; (дзеянне) -ну, -не
• насе́нне [ньне] -нні
• свякру́ха -усе, -ух
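Entries of this kind abbreviate most forms to a hyphenated tail. One possible way to restore the full forms is sketched below. The splicing rule is an assumption and not a documented convention of the 1987 dictionary: the tail is taken to replace the end of the preceding full form, starting at the last occurrence of the tail's first letter. The function names are hypothetical and stress marks are omitted.

```python
def expand(previous_full_form, abbreviated):
    """Splice an abbreviated tail such as '-жа' onto the preceding form."""
    tail = abbreviated.lstrip("-–")
    cut = previous_full_form.rfind(tail[0])
    if cut == -1:
        raise ValueError(f"cannot attach {abbreviated!r} to {previous_full_form!r}")
    return previous_full_form[:cut] + tail

def expand_entry(forms):
    """Expand every abbreviated form; the first form must be a full word."""
    full = []
    for form in forms:
        if form.startswith(("-", "–")):
            form = expand(full[-1], form)
        full.append(form)
    return full

print(expand_entry(["абвяжу", "абвяжаш", "-жа", "-жам", "-жаце", "-жуць"]))
print(expand_entry(["свякруха", "-усе", "-ух"]))
```

The rule is only a heuristic: a tail whose first letter occurs more than once near the end of the word may need a dictionary-specific convention to be resolved correctly.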
Grammatical options and the accent
• друка́рня -рні, -рань і -рняў
• мадэ́ль -ллю [льлю], -лей і -ляў
The scope of paradigm presentation varies depending on the shifting accent:
• кнот кно́та, -оце, мн. кнаты́, -то́ў
• лес ле́су, ле́се, мн. лясы́, лясо́ў
• сват сва́та, сва́це, мн. сваты́, -то́ў
• склеп -па, -пе, мн. скляпы́, -по́ў
Regular errors
• у́ (under stress) > ў: абагачу > абагачў (“enrich”)
• о́ (under stress) > б: аблокі > аблбкі (“clouds”)
However: адбракаваны (“rejected”). We can’t automatically change combinations of some consonant letters, like дбр into дор.
• е́ (under stress) > ё: смецце > смёцце (“rubbish”)
• ы > ьі
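A minimal sketch of how some of these substitutions could be reversed automatically, and why the б case cannot. It assumes the errors are systematic character substitutions in the digitized dictionary text. It relies on the spelling rule that ў is written only after vowels, and treats every ьі as a misread ы (a real tool would need an exception list). The function names are hypothetical and only lowercase input is handled.

```python
import re

CONSONANTS = "бвгджзйклмнпрстфхцчш"
ACUTE = "\u0301"  # combining acute accent used as the stress mark

def fix_regular_errors(word):
    """Undo the substitutions that can be reversed without a word list."""
    # 'ьі' is taken to be a misread 'ы'.
    word = word.replace("ьі", "ы")
    # 'ў' directly after a consonant is taken to be a misread stressed 'у'.
    word = re.sub(rf"(?<=[{CONSONANTS}])ў", "у" + ACUTE, word)
    return word

def needs_review(word):
    """Flag 'б' between consonants: it may be a misread stressed 'о',
    but it may also be genuine, so it is only reported, never changed."""
    return re.search(rf"(?<=[{CONSONANTS}])б(?=[{CONSONANTS}])", word) is not None

print(fix_regular_errors("абагачў"))
print(needs_review("аблбкі"), needs_review("адбракаваны"))
```

Both аблбкі (an error) and адбракаваны (a correct word) are flagged, which illustrates why combinations like дбр cannot be rewritten blindly.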
Substitution of some affixes
• -цель, -чык, -шчык > -льнік, -нік, -ц-а, -ец, -ар
збавіцель > збаўца (“rescuer”)
натхніцель > натхняльнік (“inspirer”)
выхавацель > выхавальнік (“educator”)
• -учы (-ючы) > -ц-а
выступаючы > выступоўца (“speaker”)
• -енн > -ав
дарэформенны > дарэформавы (“pre-reform”)
Parallelism in the word-inflexion systems of Ukrainian and Belarusian
• neuter nouns in -нне: “стая́нне” (“standing”), “абогатварэ́нне” (“idolizing”), “абаграва́нне” (“playing up”) = Ukrainian WIC #2108, nouns in -ння: “стоя́ння” (“standing”), “малюва́ння” (“drawing”)
• nouns in -сць: “лега́льнасць” (“legality”), “легкава́жнасць” (“light-mindedness”) = Ukrainian WIC #2143, 3rd declension with the change о-і in some cases: “акти́вність” (“activity”), “рапто́вість” (“suddenness”)
• adjectives in -ы: “бе́лы” (“white”), “агу́льны” (“general”) = Ukrainian WIC #2302, hard consonant stem: “бі́лий” (“white”), “спі́льний” (“common”)
• verbs in -аць: “дбаць” (“take care”), “спаць” (“sleep”) = Ukrainian WIC #697: “дба́ти” (“take care”), “спа́ти” (“sleep”), i.e. verbs of the 1st conjugation with iotized endings in the present tense and without a passive participle in the paradigm
• infinitives in -ацца: “абагашча́цца” (“get rich”), “зжыва́цца” (“get used”) = Ukrainian WIC #700: “вигиня́тися” (“bend”), “зупиня́тися” (“stop”)
Some corresponding WICs
lang | Lexeme | POS | declension | basis | change | anim | gen | WIC
Ukr. | стоя́ння | n | 2dec | hard | – | person | а | 2108
Bel. | абаграва́нне | n | 2dec | hard | – | animate | а | 2108
Ukr. | рапто́вість | n | 2dec | hard | і-о | inanimate | а | 2143
Bel. | лега́льнасць | n | 2dec | hard | – | inanimate | а | 2143

lang | Lexeme | POS | declension | basis | change | animal coord | pecul | WIC
Ukr. | бі́лий | adj | general | hard | – | + | – | 2302
Bel. | агу́льны | adj | general | hard | – | + | – | 2302
Differences within WICs
• no vowel change in Belarusian feminine nouns in -асць: “тво́рчасць” (“creativity”); such a change is inherent in similar Ukrainian nouns in -ість in some oblique cases: “влу́чність – влу́чності” (“marksmanship”)
• Ukrainian WIC #1607, masculine nouns of the 2nd declension with hard consonant stems and the genitive flexion -а, without vowel change, designating inanimate objects: “гриб” (“mushroom”) = similar Belarusian entries without vowel change, like “маро́з” (“frost”)
But:
• т-ц: “абанеме́нт” – dat. “абанеме́нце” (“season ticket”), WIC #1615
• д-дз: “пад’е́зд” – dat. “пад’е́здзе” (“doorway”), WIC #1627
• double change in the lexeme “снег” (“snow”), with the locative “сне́зе” and the nominative plural “снягі́”, WIC #1635
Ukr. WIC > Bel. WIC
Language | lexeme | part of speech | declension | basis | anim | genitiv | change | WIC
Ukr. | гриб | n | 2dec | hard | inanimate | а | – | 1607
Bel. | маро́з | n | 2dec | hard | inanimate | а | – | 1607
Bel. | абанеме́нт | n | 2dec | hard | inanimate | а | т-ц | 1615
Bel. | пад’е́зд | n | 2dec | hard | inanimate | а | д-дз | 1627
Bel. | снег | n | 2dec | hard | inanimate | а | г-з, е-я | 1635
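The split of one Ukrainian class into several Belarusian ones hinges on the stem-final consonant change. A minimal sketch of detecting that change from two word forms is given below, assuming the changed form ends in a single-letter ending such as -е. The mapping to WIC numbers repeats the table; the function name and data layout are hypothetical, stress marks are omitted, and the double change of WIC #1635 is deliberately not covered.

```python
# Change label -> Belarusian WIC, as listed in the table.
WIC_BY_CHANGE = {"–": 1607, "т-ц": 1615, "д-дз": 1627}

# (stem-final consonant in the base form, its counterpart in the changed form)
ALTERNATIONS = (("т", "ц"), ("д", "дз"))

def stem_change(base_form, changed_form):
    """Return the change label, '–' for no change, or None if unknown."""
    stem = changed_form[:-1]  # assumption: drop a one-letter ending
    if stem == base_form:
        return "–"
    for plain, changed in ALTERNATIONS:
        if (base_form.endswith(plain) and stem.endswith(changed)
                and base_form[:-len(plain)] == stem[:-len(changed)]):
            return f"{plain}-{changed}"
    return None

for base, changed in [("мароз", "марозе"), ("абанемент", "абанеменце"),
                      ("пад’езд", "пад’ездзе")]:
    label = stem_change(base, changed)
    print(base, label, WIC_BY_CHANGE[label])
```

Returning `None` for unlisted pairs such as снег / снезе keeps the sketch from silently assigning a word to the unchanged class.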
Ukr. WIC > Bel. WIC
• appearance of the prothetic в- in some grammatical meanings
• Ukrainian WIC #1991, neuter nouns with hard endings = Belarusian WIC #1991: “гаспада́рства” (“economy”)
But also:
• “акно́” (“window”) inserts в- in the plural (акно́ – во́кны), WIC #2001
• “во́зера” (“lake”) omits the same в- (во́зера – азёры), WIC #2002
Ukr. WIC > Bel. WIC
• Ukr. verbal class #490, “огорну́ти” (“embrace”) = Bel. WIC #490, “недацягну́ць” (“fail to hold out”)
But also to:
• WIC #491, ending vowel change of е-: абамкну́ць, -ну́, -нёш, -не́, -нём, -няце́, -ну́ць (“surround”)
• WIC #494, stem vowel change а-о: “абгарну́ць, абгарну́, абго́рнеш” (“embrace”)
Ukr. WIC < Bel. WIC
• Ukr. WIC #1628, vocative change к-ч: “чолові́к” (“man”) – voc. “чолові́че” (“o, man!”)
• no vocative in the modern Belarusian language
• Bel. “чалаве́к” (“man”) falls into a more general class of masculine nouns in -к, Ukr. WIC #1788, “мі́стык” (“mystic”), with no vocative change
Ukr. WIC < Bel. WIC
• two alternative forms of the accusative plural in Ukrainian for nouns designating animals, coinciding with the nominative plural and the genitive plural: “пасти коні” and “пасти коней” (“to graze horses”), vs
• one form (coinciding with the genitive plural) for nouns designating people: “зустріти дівчат”, but not “*зустріти дівчата” (“to meet girls”)
• no such differentiation in Belarusian
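The Ukrainian accusative plural rule stated above can be written as a small function. This is a sketch that covers only the two animacy values mentioned on the slide; the function name and the value labels `"animal"` and `"person"` (borrowed from the "anim" column of the tables) are illustrative.

```python
def accusative_plural_forms(animacy, nom_pl, gen_pl):
    """Return the admissible Ukrainian accusative plural forms."""
    if animacy == "animal":
        return [nom_pl, gen_pl]  # both variants are possible
    if animacy == "person":
        return [gen_pl]          # only the genitive-like form
    raise ValueError(f"animacy value not covered by this sketch: {animacy!r}")

print(accusative_plural_forms("animal", "коні", "коней"))
print(accusative_plural_forms("person", "дівчата", "дівчат"))
```

This is the distinction that forces separate "person" and "animal" classes in the Ukrainian dictionary, while Belarusian needs only one animate class.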
language | lexeme | part of speech | declension | basis | vocative change | anim | genitiv | WIC
Ukr. | мі́стик | n | 2dec | hard | – | person | а | 1788
Bel. | мі́стык | n | 2dec | hard | | animate | а | 1788
Ukr. | за́йчик | n | 2dec | hard | – | animal | а | 1789
Bel. | за́йчык | n | 2dec | hard | | animate | а | 1788
Ukr. | чолові́к | n | 2dec | hard | к-ч | person | а | 1628
Bel. | чалаве́к | n | 2dec | hard | | animate | а | 1788
Ukr. | їжа́к | n | 2dec | hard | к-ч | animal | а | 1629
Bel. | во́жык | n | 2dec | hard | | animate | а | 1788
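The many-to-one correspondence in this table can be made explicit with a short sketch that collects, for each Belarusian WIC, the Ukrainian WICs of the paired lexemes. The rows are copied from the table with stress marks omitted; the tuple layout and the function name are hypothetical.

```python
from collections import defaultdict

# (Ukrainian lexeme, Ukrainian WIC, Belarusian lexeme, Belarusian WIC)
PAIRS = [
    ("містик", 1788, "містык", 1788),
    ("зайчик", 1789, "зайчык", 1788),
    ("чоловік", 1628, "чалавек", 1788),
    ("їжак", 1629, "вожык", 1788),
]

def ukr_classes_per_bel_class(pairs):
    """Map each Belarusian WIC to the sorted Ukrainian WICs it absorbs."""
    merged = defaultdict(set)
    for _ukr_lexeme, ukr_wic, _bel_lexeme, bel_wic in pairs:
        merged[bel_wic].add(ukr_wic)
    return {bel: sorted(ukr) for bel, ukr in merged.items()}

print(ukr_classes_per_bel_class(PAIRS))  # {1788: [1628, 1629, 1788, 1789]}
```

Four Ukrainian classes, distinguished by the vocative change and by person versus animal animacy, collapse into a single Belarusian class.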
Conclusions about GDs
• Statistics of usage given by a GD can help us trace more common patterns of word-inflexion in similar classes of words, which can be useful for recommendations on standardization, considering the current variability of existing forms in both Ukrainian and Belarusian.
• Statistics of WICs can be of use in grammatical homonymy disambiguation.
• GDs can be a powerful tool for comparative studies.
• GDs are corpus-driven, so they help us reveal information about a language that is not covered in grammars, or is not covered consistently or clearly enough for users.