Transforming a century into a corpus
Experiences with Central and Southern Selkup texts
INEL Workshop 2016, 04-Nov-2016
Outline
• Introduction
• Selkup Spoken Language Corpus (SELSLC)
• Data
  • texts
  • speakers
• Workflow
  • FLEx
  • EXMARaLDA
• Problems
Introduction
• DFG project: Syntactic description of Central and Southern Selkup dialects: a corpus-based analysis (WA 3153/3-1)
[Map labels: Northern Selkup, Southern Selkup]
data: texts
• 104 texts from 35 different speakers
• all texts are previously published:
• Bajdak / Maksimova (2002, 2013, 2015)
• Bajdak / Tuchkova (2004)
• Bekker (1980)
• Bykonia et al. (1996)
• Dul’zon (1966a, 1966b)
• Grigorovskiy (1879)
• Katz (1988)
• Kuz’mina (1967)
• Kuz’mina (1977)
• Morev et al. (1981)
• Szabó (1966, 1967)
• Tuchkova / Helimski (2010)
• Tuchkova / Wagner-Nagy (2015)
data: dialect group of speaker
• Central: Narym, Tym, Parabel, Vasyugan
• Southern: Middle Ob, Upper Ob, Chaya, Lower Ket, Middle Ket, Upper Ket
data: texts per dialect
• Central Selkup texts: 37
• Southern Selkup texts: 66
data: genre of the texts
data: date of birth
data: date of recording
data: age at recording
• average age: 56 years
workflow
workflow: raw texts
workflow: raw texts
workflow: raw texts
workflow: transcribed texts
workflow: FLEx
workflow: FLEx
| Nov 16 | Central | Southern | total |
| --- | --- | --- | --- |
| tokens | 11,372 | 15,276 | 26,648 |
| sentences | 1,729 (2,423) | 2,470 (3,755) | 4,199 (6,178) |
| | 71% | 66% | 68% |
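The percentage row carries no label on the slide. It appears to be the first sentence figure divided by the bracketed one. The short check below works under that assumption; reading the bracketed number as the overall sentence total is an interpretation, not something the slide states, and the names `SENTENCES` and `share` are illustrative.

```python
# (first figure, bracketed figure) for the sentence row of the table.
# Treating the bracketed figure as the total is an assumption.
SENTENCES = {
    "Central": (1729, 2423),
    "Southern": (2470, 3755),
}


def share(done: int, total: int) -> int:
    """Return done/total as a percentage rounded to the nearest integer."""
    return round(100 * done / total)


shares = {name: share(done, total) for name, (done, total) in SENTENCES.items()}

# The "total" column is the sum over both dialect groups.
total_done = sum(done for done, _ in SENTENCES.values())
total_all = sum(total for _, total in SENTENCES.values())
shares["total"] = share(total_done, total_all)

print(shares)  # {'Central': 71, 'Southern': 66, 'total': 68}
```

Under this reading the computed values reproduce the 71%, 66% and 68% shown on the slide.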
workflow: EXMARaLDA
problems
• many different spellings of one and the same suffix / lexeme (no orthography)
• example: paldʼu ‘goʼ -> 18 allomorphs
palʼdö (1), palʼcʼe (1), paldzʼi (1), palʼdʼü (2), paldu (1), palʼde (1)
pald’ü (2), palʼdʼu (18), palʼdʼi (11), palʼdi (4), paldi (7), palʼdʼe (7)
(62) occurrences in Southern dialects, no occurrences in Central
variants
palʼcʼ (1), palʼčʼə (1), palʼčʼu (1), palč (1), paldʼzʼi (1), palʼdʼzʼi (1)
➔ Alatalo 2004 (687) -> palc’u [kam.]
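One common way to cope with this kind of variation is to keep every attested spelling and map it to a single lemma. The sketch below does this with the counts from the slide. Using paldʼu as the lemma simply follows the slide's citation form. The names `ATTESTED`, `SPELLING_TO_LEMMA` and `lemma_counts` are hypothetical and do not describe the project's actual FLEx setup.

```python
from collections import Counter

LEMMA = "paldʼu"  # citation form as given on the slide

# (spelling, occurrences) pairs copied from the slide
ATTESTED = [
    ("palʼdö", 1), ("palʼcʼe", 1), ("paldzʼi", 1), ("palʼdʼü", 2),
    ("paldu", 1), ("palʼde", 1), ("pald’ü", 2), ("palʼdʼu", 18),
    ("palʼdʼi", 11), ("palʼdi", 4), ("paldi", 7), ("palʼdʼe", 7),
    ("palʼcʼ", 1), ("palʼčʼə", 1), ("palʼčʼu", 1), ("palč", 1),
    ("paldʼzʼi", 1), ("palʼdʼzʼi", 1),
]

# Lookup table: every attested spelling points to the same lemma.
SPELLING_TO_LEMMA = {spelling: LEMMA for spelling, _ in ATTESTED}


def lemma_counts(tokens):
    """Count tokens per lemma; spellings not in the table are kept as-is."""
    return Counter(SPELLING_TO_LEMMA.get(tok, tok) for tok in tokens)


# Expand the slide's counts into a token list and aggregate by lemma.
tokens = [spelling for spelling, n in ATTESTED for _ in range(n)]
counts = lemma_counts(tokens)

print(len(SPELLING_TO_LEMMA), counts[LEMMA])  # 18 62
```

The 18 spellings and 62 occurrences match the figures on the slide. A table like this records the spelling variation explicitly instead of overwriting the source spellings.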
problems
• many different spellings of one and the same suffix / lexeme (no orthography)
• example: maːt ‘house, tent, at homeʼ -> 8 allomorphs
maːt (15|7), mat (64|18), maːd (9|5), mad (6|6), matə (4)
(104|40) -> occurrences in Southern|Central dialects
variants
madʼe (2), maːn (1), maːtə (2), matɨ (2), ma (2), mas (1)
problems
• example: suffix -sɨ (past marker) -> 26 allomorphs
-sɨ (8|1)
-saː (3|1)
-sa (110|11)
-se (4)
-sä (1)
-si (1)
-sə (4)
-s (181|1)
-ss (1)
-ssa (11)
-ssü (1)
-sse (1)
-ssi (3)
-ssaː (2)
-za (134)
-zaː (13)
-zə (3)
-ze (1)
-zɨ (28)
-z (1)
-zü (1)
-zo (1)
-zʼa (4)
-zʼaː (4)
-c (2)
-sʼa (2)
-sʼi (3)
(520|16) -> occurrences in Southern|Central dialects
variants
-ša (1)
-ʒa (1)
-ʒe (1)
problems
• dictionary entries -> which entry is the “basic word form”?
• example: paqqə ‘digʼ
Bykonia: paqqɨ (Ob.Sh), paqqɛːptɨ (Ob.Sh), paqqɛlgu (Ket), paqqəl (Ob.S, Ch, Ket), paqlešpu (Ob.Ch), paqəlešpu (Ob.Sh), paqɨl (Ob, Ket, Vas, Tym), paqəl (Ob.Ch, Ket), paqəlešpu (Ob.Sh)
Kuper/Pusztay: paqqəl
Helimski (Grigorovskij): paqqyl
problems
• how do we solve it?
paqqɨ (Ob.Sh), paqqɛːptɨ (Ob.Sh), paqqɛlgu (Ket)
-ptɨ -> causative
-l, -le -> inchoative
-gu -> iterative, reflexive, DRV
-špu -> imperfective verbal, habituative, durative
paqlešpu (Ob.Ch), paqəlešpu (Ob.Sh), paqɨl (Ob, Ket, Vas, Tym)
➔ which dialect determines the entry?
➔ which spelling do we use?
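To see why the two questions remain open even once the derivational suffixes are known, one can strip the listed suffixes mechanically and compare what is left. The sketch below is deliberately naive string stripping, not a morphological analysis and not the project's procedure. The suffix order and the function name `strip_suffixes` are assumptions made for illustration.

```python
# Derivational suffixes as listed on the slide (without the hyphen).
# Trying longer suffixes first is an assumption of this sketch.
SUFFIXES = ["špu", "ptɨ", "gu", "le", "l"]


def strip_suffixes(form: str) -> str:
    """Repeatedly remove a listed suffix from the end of `form`."""
    changed = True
    while changed:
        changed = False
        for suffix in SUFFIXES:
            # Never strip the whole form away.
            if form.endswith(suffix) and len(form) > len(suffix):
                form = form[: -len(suffix)]
                changed = True
                break
    return form


forms = ["paqqɨ", "paqqɛːptɨ", "paqqɛlgu", "paqlešpu", "paqəlešpu", "paqɨl"]
stems = {form: strip_suffixes(form) for form in forms}

for form, stem in stems.items():
    print(f"{form} -> {stem}")
```

Even after stripping, the six forms leave six different remainders (for example `paqqɛ` from paqqɛlgu but `paq` from paqlešpu). The choice of dialect and spelling for the dictionary entry therefore still has to be made by hand.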
literature
• Bykonia, V. V. (2005). Sel’kupsko-russkiy dialektnyj slovar’. Tomsk.
• Helimski, E. (2007). Yuzhnosel’kupskij slovar’ N. P. Grigorovskogo. Hamburg.
• Kuper, Sh. and J. Pusztay (1993). Sel’kupskiy razgovornik (narymskiy dialekt).
• Bajdak, A. V. and N. P. Maksimova (2002). Didaktizacija original'nogo teksta. Sel'kupskij jazyk (uchebno-metodicheskij komplekt k uchebnomu modulju). Tomsk.
• Bajdak, A. V. et al. (2013). Sel’kupskie teksty. Sbornik annotirovannykh fol’klornykh i bytovykh tekstov jazykov obsko-enisejskogo jazykovogo areala 3. Tomsk. pp. 153-201.
• Bajdak, A. V. et al. (2015). Sel’kupskie teksty. Sbornik annotirovannykh fol’klornykh i bytovykh tekstov jazykov obsko-enisejskogo jazykovogo areala 4. Tomsk. pp. 108-149.
• Bajdak, A. V. and N. A. Tuchkova (2004). Episody "Eposa ob It’t’e" v chumil’kupskom dialektnom areale. Korennye narody Sibiri: problemy istoriografii, istorii, etnografii, lingvistiki. Tomsk. pp. 51-64.
• Bekker, É. G. (1980). Sel’kupskie teksty. Skazki narodov Sibirskogo Severa [Teksty i per.] 3. T. I. Porotova and A. P. Dul'zon. Tomsk. pp. 55-71.
• Bykonja, V. V., et al. (1996). Skazki narymskikh sel’kupov (kniga dlja chtenija na sel’kupskom jazyke s perevodami na russkij jazyk); k 400-letiju Naryma. Tomsk, AO Izdat. NTL.
• Dul'zon, A. P. (1966). Ketskie skazki. Tomsk, Izd. Univ.
• Dul’zon, A. P. (1966). “Sel’kupskie skazki”. Jazyki i toponimija Sibiri. Tomsk, Izd. Tomskogo Univ. 97-158.
• Grigorovskij, N. P. (1879). Azbuka sjussogoj gulani: dlja innostrancev’ Narymskogo kraja. Kazan’.
• Kuz'mina, A. I. (1967). "Dialektologicheskie materialy po sel’kupskomu jazyku." Issledovanija po jazyku i fol’kloru 2: 267-329.
• Kuz'mina, A. I. (1976). K etymologii nazvanij mesjacev, storon sveta, zvjozd i sozvezdij v sel’kupskom jazyke. Jazyki i toponimija 4. Tomsk, Izd. Tomskogo Universiteta. 71-86.
• Morev, J. A., et al. (1981). Sel’kupskie teksty. Skazki narodov Sibirskogo Severa [Teksty i per.] 4. T. I. Porotova, A. P. Dul’zon and T. I. Porotova. Tomsk, Izd-vo Tom. un-ta: 122-143.
• Szabó, L. (1966). 254-255
• Szabó, L. (1967). Selkup texts with phonetic introduction and vocabulary. The Hague, Mouton.
• Tuchkova, N. A., et al. (2010). O materialah A. I. Kuz'minoj po sel'kupskomu jazyku [=Über die selkupischen Sprachmaterialien von Angelina I. Kuz'mina]. Hamburg [Institut für Finnougristik/Uralistik].
• Tuchkova, N. A. and B. Wagner-Nagy (2015). sēl’d’e nūn qȫdi īt’t’e … (Itja-teksty). Tomsk, Izdat. Tomsk. Gos. Ped. Univ.
Thank you!