german - uni-tuebingen.dekdk/ws09-10/focus-marking... · 2010. 2. 2. · 2.1.1. reading material...
TRANSCRIPT
-
Focus Marking inGerman
Categorial andgradient prosody(Baumann et al.2006)
Results anddiscussion
Articulatorygestures and focusmarking
The experiment
Results anddiscussion
Focus Marking in German
Kordula De Kuthy
HS Neuere Arbeiten zur Fokusprojektion WS 09/10
February 2, 2010
1 / 17
Focus Marking inGerman
Categorial andgradient prosody(Baumann et al.2006)
Results anddiscussion
Articulatorygestures and focusmarking
The experiment
Results anddiscussion
Prosodic Marking of Focus DomainsCategorial or Gradient
(1) a. Q: Who did you call?
b. A: [I called]background [MAry]F
(2) a. Q: Did you call John?
b. A: No, [I called]background [MAry]F
(3) a. Q: What happened?
b. A: [I called MAry]F
I The differences between answers in (1b), (2b), and (3b)are discrete:
I it is either MAry or I called Mary which is in focus andI MAry is either contrasted with another specific person,
or is singled out from a larger set.I Are these differences marked prosodically, andI does the prosodic marking involve
I discrete means, i.e. phonological categories such aspitch accent type, or
I gradient means, such as duration, or F0 timing andscaling differences (which do not lead to a difference inphonological categories.
2 / 17
Focus Marking inGerman
Categorial andgradient prosody(Baumann et al.2006)
Results anddiscussion
Articulatorygestures and focusmarking
The experiment
Results anddiscussion
Prosodic marking of broad vs narrow focus inGerman
I Féry (1993) looked for categorial distinctions in theprosodic marking of broad versus narrow focus inGerman.
I The result of an production experiment revealed thatspeakers used the same nuclear pitch accent type(H*L) in both broad and narrow focus as in (4) and (5).
(4) a. Q: Was ist los?
b. A: [ANna ist weggelaufen.]F
(5) a. Q: Wer ist weggelaufen?
b. A: [ANna]F [ist weggelaufen.]background
3 / 17
Focus Marking inGerman
Categorial andgradient prosody(Baumann et al.2006)
Results anddiscussion
Articulatorygestures and focusmarking
The experiment
Results anddiscussion
Production experimentBaumann et al. (2006) design a production experiment toinvestigate whetherI prosodic means are used in German to differentiate
between three sizes of focus domains involving focusprojection and
I between these and narrow focus andI between narrow focus and contrastive focus
position), in German have later and higher peak placementthan non-contrastive ones [3].
Furthermore, stressed vowels have a significantly longerduration in contrastive themes. Similarly, Second OccurrenceFocus (SOF; [10]) involves a longer duration of the targetword. SOF is induced by a focus operator such as even afterthe main focus of the phrase, that is, in the unaccented stretchfollowing the nuclear accent.
In both of the above cases gradient means are used toexpress a binary opposition: contrastive or non-contrastivethemes; SOF or no focus. In our own study, the size of focusdomains can be seen as discrete but not necessarily binary - asthe size of focus domains can be extended step by step toinclude more and more constituents, bounded only bysentence length. Contrastive or non-contrastive narrow focus,on the other hand, can be considered a binary distinction,although not all theories distinguish narrow focus fromcontrast, since narrow focus is also contrastive in some way[20].
2. Production experiment
2.1. Recordings
A production experiment was designed to investigate whetherprosodic means are used in German to differentiate betweenthree different sizes of focus domain involving focusprojection, and between these and narrow focus, and, withinthe narrow focus cateogory, contrastive focus. Our hypothesesare based on the fact that gradient variation has been found toexpress other differences in information structure (see 1.2).However, this variation does not preclude a categorical dis-tinction e.g. in pitch accent type.
2.1.1. Reading material
Reading material consisted of five question-answer pairs withthe answer ‘Manuela will Blumen malen.’ (Manuela wants topaint flowers.). The main criterion the target sentence had tofulfill was its continuous voicing, so as to be able toaccurately measure exact peaks and valleys in the F0 contour.The questions are listed below, followed by the focus domainsaccording to question-answer congruence.
Questions:
1. Was gibt’s Neues? What’s new?2. Was gibt’s Neues von Manuela? What about Manuela?3. Was will Manuela? What does Manuela want?4. Was will Manuela malen? What does Manuela want topaint?5. Manuela will Gesichter malen? Manuela wants to paintfaces?
Answers: Manuela will Blumen malen.
1. [ ] focus broad2. [ ] focus3. [ ] focus4. [ ] focus narrow5. Nein, [ ] focus contrastive
lit.: Manuela wants flowers paint
2.1.2. Speakers and recording procedure
Six speakers (three female, three male) between the ages of23 and 27 took part in the experiment. All of them werestudents at the University of Cologne. Four speakersoriginated from the north-west of Germany, one from the west(just below the Benrath isogloss), and one from the north ofBavaria.
The recordings were carried out in a soundproof room,with the instructor reading out the questions, and the subjectsgiving the answers. The five sentences were interspersed withfillers and read aloud four times in randomised orders by eachspeaker, leading to 20 tokens per speaker. Thus, 120utterances in total entered the analysis.
2.2. Analysis
Using the speech analysis tool EMU [5], we labelled the onsetand the end of the nuclear word (which was the word Blumenin all cases), and the start and end of each segment. Theprenuclear and nuclear pitch accents were transcribed inGToBI [11] with an additional label for the beginning of thenuclear rise. Example contours are given in Fig.1.
Figure 1: Example F0 contours for broad and narrow focus(answers 1 and 4, speaker CB)
3. Results and discussion
3.1. Categorical means
As a first result, contrary to predictions in the literature, boththe size of the focus domain and type of focus affect thechoice of accent type on the focus exponent: in broad(er)focus structures (sentences 1 and 2) a downstepped nuclearaccent was produced in 42% of all cases, while in narrowerfocus domains (sentences 3 and 4) fewer downsteps occurred(25% and 17%, respectively). In contrastively focussedutterances no downstep was produced at all (Fig.2).
A Spearman’s Rho correlation analysis showed asignificant interaction between nuclear pitch accent type andsentence type (p
-
Focus Marking inGerman
Categorial andgradient prosody(Baumann et al.2006)
Results anddiscussion
Articulatorygestures and focusmarking
The experiment
Results anddiscussion
Labeling of the resulting data
position), in German have later and higher peak placementthan non-contrastive ones [3].
Furthermore, stressed vowels have a significantly longerduration in contrastive themes. Similarly, Second OccurrenceFocus (SOF; [10]) involves a longer duration of the targetword. SOF is induced by a focus operator such as even afterthe main focus of the phrase, that is, in the unaccented stretchfollowing the nuclear accent.
In both of the above cases gradient means are used toexpress a binary opposition: contrastive or non-contrastivethemes; SOF or no focus. In our own study, the size of focusdomains can be seen as discrete but not necessarily binary - asthe size of focus domains can be extended step by step toinclude more and more constituents, bounded only bysentence length. Contrastive or non-contrastive narrow focus,on the other hand, can be considered a binary distinction,although not all theories distinguish narrow focus fromcontrast, since narrow focus is also contrastive in some way[20].
2. Production experiment
2.1. Recordings
A production experiment was designed to investigate whetherprosodic means are used in German to differentiate betweenthree different sizes of focus domain involving focusprojection, and between these and narrow focus, and, withinthe narrow focus cateogory, contrastive focus. Our hypothesesare based on the fact that gradient variation has been found toexpress other differences in information structure (see 1.2).However, this variation does not preclude a categorical dis-tinction e.g. in pitch accent type.
2.1.1. Reading material
Reading material consisted of five question-answer pairs withthe answer ‘Manuela will Blumen malen.’ (Manuela wants topaint flowers.). The main criterion the target sentence had tofulfill was its continuous voicing, so as to be able toaccurately measure exact peaks and valleys in the F0 contour.The questions are listed below, followed by the focus domainsaccording to question-answer congruence.
Questions:
1. Was gibt’s Neues? What’s new?2. Was gibt’s Neues von Manuela? What about Manuela?3. Was will Manuela? What does Manuela want?4. Was will Manuela malen? What does Manuela want topaint?5. Manuela will Gesichter malen? Manuela wants to paintfaces?
Answers: Manuela will Blumen malen.
1. [ ] focus broad2. [ ] focus3. [ ] focus4. [ ] focus narrow5. Nein, [ ] focus contrastive
lit.: Manuela wants flowers paint
2.1.2. Speakers and recording procedure
Six speakers (three female, three male) between the ages of23 and 27 took part in the experiment. All of them werestudents at the University of Cologne. Four speakersoriginated from the north-west of Germany, one from the west(just below the Benrath isogloss), and one from the north ofBavaria.
The recordings were carried out in a soundproof room,with the instructor reading out the questions, and the subjectsgiving the answers. The five sentences were interspersed withfillers and read aloud four times in randomised orders by eachspeaker, leading to 20 tokens per speaker. Thus, 120utterances in total entered the analysis.
2.2. Analysis
Using the speech analysis tool EMU [5], we labelled the onsetand the end of the nuclear word (which was the word Blumenin all cases), and the start and end of each segment. Theprenuclear and nuclear pitch accents were transcribed inGToBI [11] with an additional label for the beginning of thenuclear rise. Example contours are given in Fig.1.
Figure 1: Example F0 contours for broad and narrow focus(answers 1 and 4, speaker CB)
3. Results and discussion
3.1. Categorical means
As a first result, contrary to predictions in the literature, boththe size of the focus domain and type of focus affect thechoice of accent type on the focus exponent: in broad(er)focus structures (sentences 1 and 2) a downstepped nuclearaccent was produced in 42% of all cases, while in narrowerfocus domains (sentences 3 and 4) fewer downsteps occurred(25% and 17%, respectively). In contrastively focussedutterances no downstep was produced at all (Fig.2).
A Spearman’s Rho correlation analysis showed asignificant interaction between nuclear pitch accent type andsentence type (p
-
Focus Marking inGerman
Categorial andgradient prosody(Baumann et al.2006)
Results anddiscussion
Articulatorygestures and focusmarking
The experiment
Results anddiscussion
Higher accent peaks
I Two speaker show a highly significant effect of nuclearaccent pitch hight on sentence type.
support the finding of [1] for English that !H* was perceivedas significantly less prominent than L+H* or H*.
0
20
40
60
80
100
nuclear
pitch
accent
type (%)
1 2 3 4 5
sentence type
downstep
no downstep
Figure 2: Differences in nuclear pitch accent type in relationto sentence type, all speakers (N=120)
In 20% of all cases there was no prenuclear H tone, sincethe nuclear accent was the only accent in the phrase (78% ofthe prenuclear accents were of the type (L+)H* and 2% of thetype L*+H). Half of the single-accent phrases occurred incontrastive utterances, which is in line with the observation of[7] that the prominence of an accent can be increased bydeaccenting other words in the phrase.
The observation already made by [3] in their investigationfor contrastive and non-contrastive themes in German thatspeakers vary considerably as to the (combination of) meansthey employ for signalling aspects of information structure, issupported by our data.
As for the use of different accent types, for example, fourout of six speakers use downstepped nuclear accents formarking broad focus and non-downstepped peak accents formarking narrow and, in particular, contrastive focus. Theother two speakers do not use downstepping contours at all,i.e. all prenuclear and nuclear accents were of the type(L+)H*.
3.2. Gradient means
As the focus domain narrows, we also observe the use of thefollowing gradient means:
a) increased duration of the focus exponentb) higher peak on the nuclear accent (marking the
focus exponent)c) greater pitch excursion to the peak of the nuclear accentd) delay in the nuclear accent peak.
Across all speakers, duration varied consistently with the sizeof focus domain but it did not distinguish between contrastand non-contrast (Fig.3). In a one-way ANOVA with sentencetype as independent factor, the focus domain had a highlysignificant effect on the duration of the focus exponent(p
-
Focus Marking inGerman
Categorial andgradient prosody(Baumann et al.2006)
Results anddiscussion
Articulatorygestures and focusmarking
The experiment
Results anddiscussion
The Experiment
Articulatory gestures and focus marking in German
Anne Hermes, Johannes Becker, Doris Mücke, Stefan Baumann & Martine Grice
IfL Phonetik, University of Cologne, Germany {anne.hermes; becker.johannes; doris.muecke; stefan.baumann; martine.grice}@uni-koeln.de
Abstract
This study reports on a production experiment investigating
tonal and articulatory means of encoding different focus
structures in German. Using an electromagnetic articulograph,
we examined the movements of the upper and lower lips
(related to sonority expansion) during the production of target
words occurring in four different focus conditions. We found
systematic differences not only between unaccented vs.
accented target words (background vs. contrastive focus), but
also within the category ‘accented’: the differences in
articulatory expression for broad vs. contrastive focus were
expressed by greater displacements and lower stiffness of lip
aperture (opening and closing movements). Our results
suggest that German speakers express discrete linguistic
differences, namely differences in focus structure, by
gradually but systematically varying sonority expansion in
focus exponents across consonants and adjacent vowels, thus
enhancing the syntagmatic contrast.
1. Introduction
In most studies dealing with information structure, focus and
background are regarded as two distinct categories.
Consequently, it is often assumed that the prosodic marking
of these categories should be categorically distinct as well,
thus reducing the prosodic analysis to the question of whether
a constituent is accented or unaccented. More recent studies
(e.g. [2]) have shown, however, that different focus structures
are encoded by different accent types and/or by varying
continuous parameters such as duration or pitch excursion on
the focus exponents, thus creating different degrees of
prominence on the respective items.
In the present study we are primarily concerned with the
role of articulatory gestures in focus marking. The few
previous investigations in this field are restricted to words in
maximally diverging focus structures (contrastive focus vs.
background) and thus to the accented-unaccented dichotomy
(e.g. [5] for English and [1] for Italian). It is unclear from
these studies, however, whether the articulatory differences
found (e.g. greater jaw lowering or lip aperture in contrastive
focus) are simply due to accentuation or whether the
articulatory expression of different focus structures can be
regarded as a continuum of prominence or emphasis (as
reported in [6] for French).
In order to shed light on this question, we explore the
variation in articulatory parameters which are related to lip
kinematics (greater displacement, longer duration, higher
peak velocity and lower stiffness of lip opening to enhance
prominence) in the marking of target words occurring in
different types of focus (contrastive, non-contrastive) and
different sizes of focus domain (broad, narrow; see e.g. [9]),
or in the background. In particular, we investigate differences
within the category ‘accent’ (broad vs. narrow focus, broad
vs. contrastive focus) as well as between accented and
unaccented words (contrastive focus vs. background). Results
on differences between background and broad focus as well
as narrow and contrastive focus are not presented here.
1.1. Reading material
The speech material included question-answer sets eliciting
four different focus structures: the NP under investigation
occurred either as part of the previously mentioned
background or in broad, narrow or contrastive focus. The
target words, i.e. the fictitious names after Dr. !"#$%&',
were always disyllabic, with the stressed syllable containing
one of the four long target vowels /i!/, /a!/, /o!/ or /u!/. An
example of a question-answer set is given below:
Questions:
1. Will Norbert Dr. Bahber treffen? Does Norbert want to
meet Dr. Bahber?
2. Was gibt´s Neues? What´s new?
3. Wen will Melanie treffen? Whom does Melanie want to
meet?
4. Will Melanie Dr. Werner treffen? Does Melanie want to
meet Dr. Werner?
Answers: test word in:
Melanie will Dr. Bahber treffen.
1. [ ]focus background
2. [ ]focus broad focus
3. [ ]focus narrow focus
4. [ ]focus contrastive focus
(lit.: Melanie wants Dr. Bahber to-meet)
1.2. Speakers and recordings
Three native speakers of Standard German (aged 26, 27 and
37) were recorded with a 2D Electromagnetic Midsagittal
Articulograph (EMMA) and a time-synchronized DAT-
recorder. The kinematic data were recorded at 500Hz,
downsampled to 200Hz and smoothed with a 40Hz low-pass
filter. The acoustic data were digitized at 44.1kHz.
The subjects listened to the questions (which were
presented both visually and auditorily) and were instructed to
answer these questions in a contextually appropriate manner
and at a normal speech rate. After a test block of five
question-answer-pairs each subject read out the target
sentences (four focus structures, four target words, seven
repetitions) in pseudo-randomised order, leading to 112
tokens per speaker in total.
Lip movements were monitored by EMMA (Carstens
AG100), with sensors placed on the vermillion border of the
upper and lower lip within the midsagittal plane. Two
additional sensors on the nose and the upper gums served as a
reference in order to correct helmet movements during the
recordings.
13 / 17
Focus Marking inGerman
Categorial andgradient prosody(Baumann et al.2006)
Results anddiscussion
Articulatorygestures and focusmarking
The experiment
Results anddiscussion
Labeling of the data
200
300
400
DM AH WP
background broad narrow contrastive
*** ***
*** ***
*** ***
!" #$ " %&
!" #$ " %&
1.3. Labelling procedure
Acoustic and articulatory data were labelled by hand using the EMU speech database system. A screen shot including all tiers and labels described below is given in Fig.1.
Segment boundaries of consonants and vowels of the accented and post-accented syllables (c1, v1, c2, v2) were annotated in the acoustic waveform.
In the tonal analysis we identified three different GToBI accent types on the target word (as proposed in [2]): !H* (downstep), ^H* (upstep) or H* (neither downstep nor upstep). Note that up- and downsteps are always related to a preceding prenuclear LH accent on the subject argument. Deaccentuation of the target word was marked with ‘Ø’.
For the kinematic data, the lip aperture (LA) index was calculated in terms of the Euclidean distance between the two sensors of upper and lower lip, including movements both in the horizontal and vertical dimension [4]. Minima and maxima of opening and closing gestures (min1, max1, min2, max2) were located at zero-crossings in the respective velocity trace. Additionally, we labelled peak velocities at zero-crossings in the respective acceleration trace (p1, p2, p3). Twelve utterances (all from speaker WP) were removed from the analysis because no clear turning points for the lip kinematics could be identified.
Figure 1: Labelling scheme; from top to bottom: oscillogram, F0 curve, velocity and position curve
of lip aperture (LA); target word B/i:/ber.
2. Results and Discussion
We analysed all measures with one-way-ANOVAs for each speaker separately and with a Tukey post hoc test. The dependent variables included accent type and word duration for the acoustic measures, and displacement, peak velocity, duration and stiffness for the articulatory measures. The independent variable FOCUS STRUCTURE included broad focus, narrow focus, contrastive focus and background.
2.1. Accent types
Table 1 shows the accent types preferably used by the three speakers in the different focus structures. As expected, all speakers deaccented the target words when they occurred in the background. In broad focus, speakers DM and AH almost exclusively used downsteps (DM 85.2%; AH 100%), whereas
speaker WP typically produced upsteps (84%; only 4% downsteps). In the narrow focus condition, speakers DM and WP both produced upsteps (DM 82.6%; WP 100%), while speaker AH used all three accent types nearly to the same extent (36% upsteps; 32% downsteps; 32% unmodified H*). In contrastive focus, all three speakers always used upsteps.
Table 1: Most frequently produced accent types per speaker and focus condition.
2.2. Acoustic durations
We examined the duration of the target words for all speakers. Since our target words are disyllabic, the domain ‘word’ is identical with the domain ‘foot’. Fig.2 shows mean durations of the target word B/i:/ber for the different focus conditions. For all three speakers, we found a significant increase in word duration from background to contrastive focus (e.g. AH: 33ms longer, p
-
Focus Marking inGerman
Categorial andgradient prosody(Baumann et al.2006)
Results anddiscussion
Articulatorygestures and focusmarking
The experiment
Results anddiscussion
Kinematic resultsI Kinematic results are presented for two speakers for the
vowel /i:/.I The figure shows averaged trajectories for the distance
between upper and lower lip during the production ofthe target word, for each focus condition separately.
I Low displacements indicate that the lips are closed forthe production of the stop consonants.
I High values indicate open lips during the vowels.I Going from background through broad and narrow to
contrastive focus there is an increase in duration and lipaperture
200
300
400
DM AH WP
background broad narrow contrastive
*** ***
*** ***
*** ***
!" #$ " %&
!" #$ " %&
1.3. Labelling procedure
Acoustic and articulatory data were labelled by hand using the EMU speech database system. A screen shot including all tiers and labels described below is given in Fig.1.
Segment boundaries of consonants and vowels of the accented and post-accented syllables (c1, v1, c2, v2) were annotated in the acoustic waveform.
In the tonal analysis we identified three different GToBI accent types on the target word (as proposed in [2]): !H* (downstep), ^H* (upstep) or H* (neither downstep nor upstep). Note that up- and downsteps are always related to a preceding prenuclear LH accent on the subject argument. Deaccentuation of the target word was marked with ‘Ø’.
For the kinematic data, the lip aperture (LA) index was calculated in terms of the Euclidean distance between the two sensors of upper and lower lip, including movements both in the horizontal and vertical dimension [4]. Minima and maxima of opening and closing gestures (min1, max1, min2, max2) were located at zero-crossings in the respective velocity trace. Additionally, we labelled peak velocities at zero-crossings in the respective acceleration trace (p1, p2, p3). Twelve utterances (all from speaker WP) were removed from the analysis because no clear turning points for the lip kinematics could be identified.
Figure 1: Labelling scheme; from top to bottom: oscillogram, F0 curve, velocity and position curve
of lip aperture (LA); target word B/i:/ber.
2. Results and Discussion
We analysed all measures with one-way-ANOVAs for each speaker separately and with a Tukey post hoc test. The dependent variables included accent type and word duration for the acoustic measures, and displacement, peak velocity, duration and stiffness for the articulatory measures. The independent variable FOCUS STRUCTURE included broad focus, narrow focus, contrastive focus and background.
2.1. Accent types
Table 1 shows the accent types preferably used by the three speakers in the different focus structures. As expected, all speakers deaccented the target words when they occurred in the background. In broad focus, speakers DM and AH almost exclusively used downsteps (DM 85.2%; AH 100%), whereas
speaker WP typically produced upsteps (84%; only 4% downsteps). In the narrow focus condition, speakers DM and WP both produced upsteps (DM 82.6%; WP 100%), while speaker AH used all three accent types nearly to the same extent (36% upsteps; 32% downsteps; 32% unmodified H*). In contrastive focus, all three speakers always used upsteps.
Table 1: Most frequently produced accent types per speaker and focus condition.
2.2. Acoustic durations
We examined the duration of the target words for all speakers. Since our target words are disyllabic, the domain ‘word’ is identical with the domain ‘foot’. Fig.2 shows mean durations of the target word B/i:/ber for the different focus conditions. For all three speakers, we found a significant increase in word duration from background to contrastive focus (e.g. AH: 33ms longer, p