1
Rule reliability and productivity
Velar palatalization in Russian and artificial grammar
Vsevolod Kapatsinski
Indiana University
[email protected]://mypage.indiana.edu/~vkapatsi/
Laboratory Phonology XI
30 June – 2 July 2008
Work supported by NIH Training Grant DC-00012 and NIH Research Grant DC-00111
2
The puzzle of productivity loss
• Morphophonemic rules can lose productivity while having no exceptions in the lexicon
• How does this happen? If there are a lot of examples supporting a rule, why would it fail?
3
Case study: Velar palatalization in Russian
k → tʃ, g → ʒ before:
-i (verbal stem extension)
-ek/-ik (nominal diminutive)
-ok (nominal diminutive)
Exceptionless in the lexicon (Levikova 2003, Sheveleva 1974)
Fully productive before -ek and -ok
but
Partially productive before -i and -ik.
Why?
4
Hypothesis
• Rules are extracted from the lexicon
• Rules compete for inputs
• Competition is resolved by relative reliability
• Reliability = number of inputs that undergo the rule divided by the number of inputs that could undergo the rule
(Albright and Hayes 2003, Pierrehumbert 2006)
For [] → -ed: # of verbs that take -ed / # of verbs in English
5
Rule-Based Learner (Albright and Hayes 2003)
• Takes in a lexicon of pairs of morphologically related words
blok, blotʃi-
sok, sotʃi-
sobak, sobatʃi-
zavtrak, zavtraka-
• Generalizes rules from it and weights them by reliability
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
[] → a / ak_ (0.5)
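The three reliabilities on this slide can be reproduced by counting over the four-pair toy lexicon. The sketch below is only an illustration and is not Albright and Hayes's implementation: the rule contexts are hand-coded as hypothetical predicates instead of being induced by generalization, the affricate in the palatalized forms is written tʃ (an assumption about the symbol lost in the transcript), and the names `LEXICON`, `RULES`, and `reliability` are invented here.

```python
from fractions import Fraction

# Toy lexicon from the slide: (base, derived stem).
LEXICON = [("blok", "blotʃi"), ("sok", "sotʃi"),
           ("sobak", "sobatʃi"), ("zavtrak", "zavtraka")]

# Assumption: o and a are the only V[+back;-high] vowels in the toy data.
BACK_NONHIGH = set("oa")


def palatalize(base):
    """k -> tʃi: replace the final k with tʃ and add -i."""
    return base[:-1] + "tʃi"


def add_a(base):
    """[] -> a: add -a without changing the stem."""
    return base + "a"


# Each rule: (label, context test on the base, change applied to the base).
RULES = [
    ("k -> tʃi / o_", lambda b: b.endswith("ok"), palatalize),
    ("k -> tʃi / V[+back;-high]_",
     lambda b: b[-1] == "k" and b[-2] in BACK_NONHIGH, palatalize),
    ("[] -> a / ak_", lambda b: b.endswith("ak"), add_a),
]


def reliability(context, change, lexicon):
    """Inputs that undergo the rule divided by inputs that could undergo it."""
    scope = [(base, derived) for base, derived in lexicon if context(base)]
    hits = [base for base, derived in scope if change(base) == derived]
    return Fraction(len(hits), len(scope))


for label, context, change in RULES:
    print(label, float(reliability(context, change, LEXICON)))
```

Run on this lexicon, the three rules come out at 1.0, 0.75, and 0.5, the values shown on the slide.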
6
Rule-Based Learner (Albright and Hayes 2003)
• Generalizes rules from the lexicon and weights them by reliability
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
[] → a / ak_ (0.5)
• For each distinct output that an input can become, there will be one rule that's more reliable than other rules producing that output from that input
bok → botʃi
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
• The probability of an output given an input is given by dividing the reliability of the most reliable applicable rule producing that output by the sum of reliabilities of the most reliable rules leading to different outputs
bok → botʃi: 1/(1+0.5) = 67%
bok → boka: 0.5/(1+0.5) = 33%
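The probability calculation described on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `output_probabilities` and the (output, reliability) input format are invented here, and the reliabilities are the ones the slide uses for bok.

```python
def output_probabilities(applicable_rules):
    """Map each distinct output to its probability.

    applicable_rules: iterable of (output, reliability) pairs, one pair per
    rule that applies to the input.
    """
    best = {}
    for output, rel in applicable_rules:
        # Keep only the most reliable rule for each distinct output.
        best[output] = max(rel, best.get(output, 0.0))
    total = sum(best.values())
    return {output: rel / total for output, rel in best.items()}


# Candidate outputs for "bok" with the reliabilities used on the slide.
bok = [("botʃi", 1.0),    # k -> tʃi / o_
       ("botʃi", 0.75),   # k -> tʃi / V[+back;-high]_ (less reliable, ignored)
       ("boka", 0.5)]     # -a output; the slide assigns it 0.5

for output, p in output_probabilities(bok).items():
    print(output, round(p, 2))
```

The less reliable palatalizing rule drops out because only the best rule per output enters the sum, which leaves 1/(1+0.5) and 0.5/(1+0.5).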
7
blok, blotʃi-
sok, sotʃi-
lak, latʃi-
zavtrak, zavtraka-
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
[] → a / ak_ (0.5)
bak → batʃi: 0.75/(0.75+0.5) = 60%
bak → baka: 0.5/(0.5+0.75) = 40%
*baki: palatalization never fails before -i
8
blok, blotʃi-
sok, sotʃi-
sobak, sobatʃi-
zavtrak, zavtraka-
plat, plati-
kos, kosi-
trub, trubi-
var, vari-
ver, veri-
sol, soli-
voz, vozi-
sor, sori-
ar, ari-
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
[] → i / C_ (0.69)
[] → a / ak_ (0.5)
bak → batʃi: 0.75/(0.75+0.5+0.69) = 39%
bak → baka: 0.5/(0.5+0.75+0.69) = 26%
bak → baki: 0.69/(0.5+0.75+0.69) = 36% (palatalization fails)
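To see how the number of stored non-velar verbs in -i drives the predicted failure rate, the calculation for bak can be repeated for lexicons of different sizes. The sketch below is hypothetical: the function `velar_i_probabilities` is invented here, and it assumes the reliabilities of the palatalizing rule (0.75) and the -a rule (0.5) stay fixed while only the count of non-velar -i verbs changes.

```python
def velar_i_probabilities(n_nonvelar_i, n_velar=4,
                          rel_palatalize=0.75, rel_a=0.5):
    """Predicted outputs for 'bak' given n_nonvelar_i non-velar verbs in -i.

    The reliability of [] -> i / C_ is the number of consonant-final bases
    that simply add -i divided by all consonant-final bases in the lexicon.
    """
    rel_i = n_nonvelar_i / (n_nonvelar_i + n_velar)
    total = rel_palatalize + rel_a + rel_i
    return {"batʃi": rel_palatalize / total,
            "baka": rel_a / total,
            "baki": rel_i / total}


# 0 reproduces the lexicon without non-velar verbs; 9 matches this slide.
for n in (0, 3, 9, 30):
    probs = velar_i_probabilities(n)
    print(n, {out: round(p, 2) for out, p in probs.items()})
```

With no non-velar verbs the unpalatalized output baki gets no support; with the nine verbs listed on this slide the shares come out at 39%, 26%, and 36%, and adding more non-velar -i verbs pushes the palatalized share down further.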
[Diagram, repeated on three slides: the first unlabeled, the second labeled -ek, -ok, the third labeled -i, -ik]
Stored words derived from a velar-final input and bearing -i
Stored words derived from a non-velar input and bearing -i
New inputs that end in a velar and take -i
Outcomes: -i is preceded by an alveopalatal in the output; -i is preceded by a velar in the output
12
Testing the hypothesis
• Borrowings from English in online communication
  – Inputs:
    • Take all verbs and nouns that end in /k/ or /g/ from the British National Corpus, e.g., lock
    • Plus a sample of verbs and nouns ending in other stops (for nouns, matched preceding vowel proportions)
  – Outputs:
    • Choose suffix
      – For a verb, -i, -a, or -ova
      – For a noun, -ik, -ek, or -ok
    • Choose whether to change the stem
      – For a verb: lokatj, lokovatj, lotʃitj, lokitj
      – For a noun: lotʃok, lokok, lotʃek, lokek, lotʃik, lokik
  – Count:
    • Submit the possible outputs to Google
    • Rate of vel.pal. failure: lokitj / (lotʃitj + lokitj)
56 velar-final, 140 non-velar-final
20 velar-final, 40 non-velar-final
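The failure-rate measure above is a simple proportion of hit counts. The sketch below shows the arithmetic only; the function name is invented here and the counts are hypothetical placeholders, not data from the study.

```python
def palatalization_failure_rate(unchanged_hits, palatalized_hits):
    """Share of attested forms that keep the velar before the suffix."""
    total = unchanged_hits + palatalized_hits
    if total == 0:
        return None  # neither form is attested, so the rate is undefined
    return unchanged_hits / total


# Hypothetical hit counts, for illustration only (not results from the study).
hits = {"lokitj": 12, "lotʃitj": 18}
rate = palatalization_failure_rate(hits["lokitj"], hits["lotʃitj"])
print(rate)
```

Returning `None` for unattested pairs keeps them out of any later averaging instead of counting them as 0% failure.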
13
Results: Stem extensions
Velars favor –a over –i while –i is favored elsewhere
[Figure: likelihood of taking -i by base type (velar-final, labial-final, coronal-final)]
14
Results: Stem extensions
Velar palatalization is likely to fail before –i despite being exceptionless; AND –i is favored by non-velar-final inputs
Mean 44%
15
Results: Diminutives
-ik is favored by non-velars
-ok and -ek are favored by velars
Velar palatalization fails only before -ik
[Figure labels: Mean 0%, Mean 1%, Mean 35%]
16
Results: Diminutives
[Same text and figure labels as on the previous slide, with an added figure: suffixes -ek, -ik, -ok by base-final consonant (g; k; p, b, t, d). Figure labels: Mean 10%, Mean 0%, Mean 100%]
17
Evidence from artificial grammar
• Issue:
  • speakers avoid using -i after velars because vel.pal. is unproductive before -i
  OR
  • vel.pal. is unproductive before -i because -i is mostly used after non-velars
18
Evidence from artificial grammar
• Native English speakers exposed to two artificial languages:
Language                      BLUE         RED
{k;g} → {tʃ;dʒ}i              100% (30)    100% (30)
{t;d;p;b} → {t;d;p;b}i        25% (8)      75% (24)
{t;d;p;b} → {t;d;p;b}a        75% (24)     25% (8)
24
Results
Rate of velar palatalization is lower in Red Language than in Blue Language
Prediction confirmed
[Figure: rate of velar palatalization in BLUE vs. RED; difference marked *]
25
Results
The more productive -i is with non-velar-final inputs for a subject, the less productive is velar palatalization for the same subject.
***
Constraining the model: Processing stages
• Two-stage model:
  – Stage I: -i vs. -a
  – Stage II: g → ʒ vs. ‘do nothing’
• One-stage model:
  – g → ʒi vs.
  – g → ga vs.
  – C → Ci
27
Context effects
Velar palatalization is likely to fail before –i despite being exceptionless
Mean 44%
28
Explaining context effects
• Context effects are due to differences in the relative reliabilities of specific velar-changing rules
g → ʒi / V[+back;-high]_ (.475)    log: .475 vs. .232
g → ʒi / V[-high]_ (.350)
g → ʒi / V_ (.272)
g → ʒi / [+voice]_ (.195)    ping: .195 vs. .232
[] → i / C[+voiced]_ (.232)
Suppose that the decision on whether to change the stem is made in the context of an already chosen suffix (-i)
In this context, all velar-changing rules are completely reliable (they are exceptionless).
Thus, relative reliability predicts context effects only if the suffix and the stem change are chosen simultaneously.
g → ʒ / V[+back;-high]_i (1.0)    log: 1.0 vs. .756
g → ʒ / V[-high]_i (1.0)
g → ʒ / V_i (1.0)
g → ʒ / [+voice]_i (1.0)    ping: 1.0 vs. .756
[] → [] / C[+voiced]_i (.756)
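The contrast between the two sets of reliabilities can be made concrete by turning each pairing into a probability of stem change. The sketch below is a simplification: it assumes that only the most reliable velar-changing rule and the single no-change rule compete for each word, and the names `p_change`, `one_stage`, and `two_stage` are invented here.

```python
def p_change(rel_change, rel_no_change):
    """Probability of the stem-changing rule against one no-change competitor."""
    return rel_change / (rel_change + rel_no_change)


# (most reliable applicable velar-changing rule, no-change rule), from the slide.
one_stage = {"log": (0.475, 0.232), "ping": (0.195, 0.232)}
two_stage = {"log": (1.0, 0.756), "ping": (1.0, 0.756)}

for name, model in (("one-stage", one_stage), ("two-stage", two_stage)):
    for word, (rel_change, rel_keep) in model.items():
        print(name, word, round(p_change(rel_change, rel_keep), 2))
```

Under these assumptions the one-stage reliabilities separate log from ping (about 0.67 vs. 0.46), while the two-stage reliabilities give both words the same value (about 0.57), so only the one-stage setup predicts a context effect.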
29
Constraining the model: Decision rule
• Rule-Based Learner relies on a stochastic decision between competing rules
• The speaker cannot go for the most reliable rule all the time
  – The most reliable rule in both the blue language and the red language is palatalizing, so if speakers always chose it, the languages should not differ
  – Albright and Hayes (2003)
    • Novel verbs that are similar to many regular English verbs are more likely to take the regular past tense than novel verbs that are similar to neither regular nor irregular English verbs
    • Regular rule is the most reliable one in both cases
    • If speakers always chose it, the two classes of words should not differ
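The difference between the two decision rules can be sketched by simulation. This is an illustration under stated assumptions: the reliabilities are the toy values given earlier for bak (0.75, 0.5, 0.69), the function names are invented here, and the stochastic rule samples each output in proportion to the reliability of its best rule.

```python
import random

# Reliability of the best rule per output for 'bak' (toy values from the
# earlier slide; tʃ is an assumed transcription of the palatalized stem).
BEST_RULES = {"batʃi": 0.75, "baka": 0.5, "baki": 0.69}


def choose_deterministic(best_rules):
    """Always pick the output of the single most reliable rule."""
    return max(best_rules, key=best_rules.get)


def choose_stochastic(best_rules, rng):
    """Pick an output with probability proportional to its reliability."""
    outputs = list(best_rules)
    weights = [best_rules[o] for o in outputs]
    return rng.choices(outputs, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000
samples = [choose_stochastic(BEST_RULES, rng) for _ in range(n)]
print("deterministic:", choose_deterministic(BEST_RULES))
print("stochastic share of batʃi:", samples.count("batʃi") / n)
```

The deterministic rule returns the palatalized form every time, so it cannot register any change in the competitors' reliabilities as long as palatalization stays on top; the stochastic rule yields the palatalized form only part of the time, and that share moves with the competitors.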
30
Summary
If
• Rules compete
• The outcome of competition is influenced by reliability (Albright and Hayes 2003, Pierrehumbert 2006)
• Known words are retrieved from the lexicon, not generated by the grammar
Then
• An exceptionless rule loses productivity but can remain exceptionless if the triggering affix comes to be used mostly with segments that cannot undergo the rule.
To account for the present results,
• Competition between rules must be resolved stochastically.
• The suffix and the stem shape must be chosen during a single decision stage.
31
References
Albright, A., and B. Hayes. 2003. Rules vs. analogy in English past tenses: A computational/experimental study. Cognition, 90, 119-61.
Bybee, J., and J. Newman. 1995. Are stem changes as natural as affixes? Linguistics, 33, 633-54.
Kapatsinski, V. M. 2005. Characteristics of a rule-based default are dissociable: Evidence against the Dual Mechanism Model. In S. Franks, F. Y. Gladney, and M. Tasseva-Kurktchieva (eds), Formal Approaches to Slavic Linguistics 13: The South Carolina Meeting, 136-46. Ann Arbor, MI: Michigan Slavic Publications.
Levikova, S. I. 2003. Bol’shoj slovar’ molodezhnogo slenga. [The big dictionary of youth slang]. Moscow: Fair-Press.
Pierrehumbert, J. B. 2006. The statistical basis of an unnatural alternation. In L. Goldstein, D. H. Whalen, and C. Best (eds), Laboratory Phonology VIII: Varieties of Phonological Competence, 81-107. Berlin: Mouton de Gruyter.
Sheveleva, M. S. 1974. Obratnyj slovar’ russkogo jazyka. [Reverse dictionary of Russian]. Moscow: Sovetskaja Enciklopedija.