![Page 1: Building a sentential model for automatic prosody evaluation Kyuchul Yoon School of English Language & Literature Yeungnam University 2009.06.19 Korea](https://reader035.vdocument.in/reader035/viewer/2022062719/56649ed45503460f94be4cd5/html5/thumbnails/1.jpg)
Building a sentential model for automatic prosody evaluation
Kyuchul Yoon
School of English Language & Literature
Yeungnam University
2009.06.19
Korea University
Part A
English pronunciation evaluation
English pronunciation proficiency evaluation
– Ultimate goals
  • Evaluation at
    – The segmental level
    – The suprasegmental level
– Current goals
  • Evaluation at
    – The suprasegmental level
Introduction
English pronunciation evaluation
The goal of the present study
– Prosody evaluation of a single target utterance
  • Produced by a Korean student
  • Given
    – An English target sentence
    – A sentential model for prosody evaluation
Introduction
Manual vs. automatic
Problems of manual evaluation
– What to evaluate
– How to evaluate
– Consistency
Problems of automatic evaluation
– How to reflect human knowledge
Introduction
Manual vs. automatic
A possible solution?
– Avoid knowledge-based abstraction
  • Compare a target utterance with native speakers’ utterances
– Use multiple utterances for comparison
  • Multiple “good” utterances from native speakers
– Adopt raw values
  • Calculate difference values between the target and the “good” utterances in terms of
    – The three prosodic aspects: F0, intensity, durations (3D coordinates)
Introduction
How to build the model
Use multivariate statistical analysis
– A discriminant analysis
The components of the model (the segmental proficiency scores controlled)
– The manual prosody evaluation scores (response)
– The automatic prosody evaluation scores (factors)
The requirements of the model
– The correlation between the two levels
  • Manual scores vs. automatic scores
Introduction
How to build the model
The manual prosody scores (an ideal case)
• The “good” utterance versions (point 5) by many native speakers of English
• The utterance versions by Korean students whose prosodic proficiencies are
  • High (point 5)
  • Intermediate (point 3)
  • Low (point 1)
• On a scale of 1 (worst) to 5 (best)
Introduction
How to build the model
The automatic prosody scores
• Use of Praat scripts
• Comparison between a single target utterance & multiple native speakers’ utterances to yield scores for
  – The F0 difference
  – The intensity difference
  – The duration difference
  in the form of 3D coordinates (x, y, z) = (F0, Int, Dur)
• One utterance yields as many coordinates as the number of “good” native speakers
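The comparison step above can be sketched in a few lines. This is only an illustration in Python, not the Praat scripts used in the study: the dictionary keys, the function name `prosody_coordinates`, and the toy values are all hypothetical, and the sketch assumes that every contour has already been aligned and sampled to the same number of points. It uses the Euclidean distance, which the Methods slides name as the evaluation measure.

```python
import math


def prosody_coordinates(target, natives):
    """Return one (F0, Int, Dur) difference triple per native utterance.

    Each utterance is a dict holding equal-length numeric sequences under
    the hypothetical keys "f0", "intensity", and "durations".
    """
    coords = []
    for native in natives:
        # One Euclidean distance per prosodic aspect: (x, y, z).
        coords.append(tuple(
            math.dist(target[key], native[key])
            for key in ("f0", "intensity", "durations")
        ))
    return coords


# Toy values, invented purely for illustration.
target = {"f0": [0.0, 0.0], "intensity": [1.0, 1.0], "durations": [0.2, 0.3]}
natives = [
    {"f0": [3.0, 4.0], "intensity": [1.0, 1.0], "durations": [0.2, 0.3]},
    {"f0": [0.0, 0.0], "intensity": [1.0, 3.0], "durations": [0.2, 0.3]},
]
coords = prosody_coordinates(target, natives)
print(coords)
```

With two native utterances the result holds two coordinates, one per native speaker, which mirrors the last bullet above.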
Introduction
How to build the model
Evaluation by comparisons
Introduction
A 3D sentential model for prosody evaluation
A 3D model
– 3D axes: F0, intensity, durations
  (F0, Int, Dur) coordinates = (x, y, z)
– Automatic scores as scatterplot points
– Manually evaluated scores group the points
Introduction
A 3D sentential model for prosody evaluation
Validity of the model
– Sufficient separation of groups with different manual scores
– Colors: manual scores
– Arrowheads: automatic scores
Introduction
Sentential prosody evaluation [7]
Before & after duration manipulation
[Figure panels: native; learner (before); learner (after)]
Methods
Sentential prosody evaluation [7]
F0: point-to-point comparison between native and learner after normalization
[Figure panels: native; learner (after)]
Methods
Automatic score: (F0, Int, Dur) = (x, y, z)
Sentential prosody evaluation [7]
Intensity: point-to-point comparison between native and learner after normalization
[Figure panels: native; learner (after)]
Methods
Automatic score: (F0, Int, Dur) = (x, y, z)
Sentential prosody evaluation [7]
Duration: segment-to-segment comparison between native and learner
[Figure panels: native; learner (before)]
Methods
Evaluation measure: the Euclidean distance between P = (p1, p2, p3, ..., pn) and Q = (q1, q2, q3, ..., qn) in Euclidean n-dimensional space
Automatic score: (F0, Int, Dur) = (x, y, z)
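For reference, the standard Euclidean distance between the two n-dimensional points P and Q defined on this slide is written out below. Whether the study's scripts scale or otherwise post-process this value is not stated on the slide, so only the textbook form is given.

```latex
d(P, Q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}
```

Applied separately to the F0 points, the intensity points, and the segment durations, this yields the three values of one (x, y, z) coordinate.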
Manual evaluation of sentential prosody
Methods
Manual scores for Set B utterances
“The dancing queen likes only the apple pies”
Sentential prosody evaluation [7]
Methods
A sample score array for one utterance from group K5: one learner utterance vs. 10 model native utterances
Automatic prosody score for K5.U1 = {(899,142,408), (360,92,190), … (716,178,183)}
A prosody evaluation model by a Korean phonetician
Results
Korean phonetician’s Model
A prosody evaluation model by a Korean phonetician
Results
Korean phonetician’s Model
A sample prosody evaluation with a discriminant analysis
Results
To make this fully automatic
Discussion
For manual evaluation of the training model
– The number of Korean learners
  • The more the better
– The levels of English proficiency
  • The more diverse the better (scores 1 through 5)
For automatic evaluation of the trainees
– Need automatic segmentation (ASR)
– Need to deal with redundant/missing segments
Building a sentential model for automatic evaluation of pronunciation proficiency
What about segmental evaluation?
Part B
Segmental evaluation by spectral comparison
Methods
Sex/age controlled (no normalization was used)
– Adult male (native/Korean) speakers were selected
Spectral comparison
– Three equally spaced spectral slices were used for each pair of matching segments
– The Euclidean distance was measured between each pair of matching spectral envelopes
Four coordinates for pronunciation proficiency evaluation
– Segments, F0, intensity, durations
– (w, x, y, z) becomes one element of the score array
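The spectral comparison described above can be sketched as follows. The slide does not say where in a segment the three slices fall or how the three per-slice distances are combined, so this sketch assumes slices at one quarter, one half, and three quarters of the segment and simply sums the distances. The function names `slice_times` and `segment_distance` are hypothetical, and the spectral envelopes are taken as already computed lists of values.

```python
import math


def slice_times(start, end, n_slices=3):
    """Equally spaced time points strictly inside the segment [start, end]."""
    step = (end - start) / (n_slices + 1)
    return [start + step * (i + 1) for i in range(n_slices)]


def segment_distance(envelopes_a, envelopes_b):
    """Sum of Euclidean distances between matching spectral envelopes.

    Summing over the slices is an assumption made for this sketch.
    """
    return sum(math.dist(a, b) for a, b in zip(envelopes_a, envelopes_b))


# Toy envelopes (two frequency bins per slice), invented for illustration.
native_env = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
learner_env = [[3.0, 4.0], [1.0, 1.0], [2.0, 2.0]]

times = slice_times(0.0, 0.4)
w = segment_distance(native_env, learner_env)
print(times, w)
```

Accumulating such values over all matching segments of an utterance would give the segmental coordinate w that joins (x, y, z) in the score array.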
Manual evaluation of overall proficiency
Methods
Manual scores for Set C utterances
“Put your toys away right now”
<Table 4> The overall scores of the 34 utterances for the Set C sentence “Put your toys away right now”. The manual evaluation was performed by a Korean phonetician. Note that the subjects were all male adults.
A pronunciation proficiency evaluation model by a Korean phonetician
Results
Korean phonetician’s Models
(Intensity axis not shown)
A prosody evaluation model by a Korean phonetician
Results
Korean phonetician’s Model
A discriminant analysis
Results
<Table 5> The classification table from the discriminant analysis of one test data. The number in each cell represents the probability of the automatic pronunciation proficiency score being classified into the predicted group.
<Table 6> The confusion matrix for the classification table.
Discriminant analyses with leave-one-out cross-validation
Results
Testing for score 4 : 6 out of 9 correct
Testing for score 2 : 12 out of 15 correct
Discriminant analyses with leave-one-out cross-validation
Results
For the N4 & K2 groups, evaluation models were built by using
– Discriminant analysis with leave-one-out cross-validation
The number of models (built by discriminant analyses) was 24
– Group N4: 9 subjects
– Group K2: 15 subjects
Success rate
– Group N4: 6 out of 9 predicted correctly
– Group K2: 12 out of 15 predicted correctly
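The leave-one-out procedure can be illustrated with a short sketch: each subject is held out once, a model is built from the remaining subjects, and the held-out subject is then classified, which is why 24 subjects give 24 models. To stay dependency-free, the sketch swaps in a nearest-centroid classifier as a simple stand-in. It is not the discriminant analysis used in the study. The one-vector-per-subject layout, the function names, and the toy data are likewise assumptions.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def leave_one_out(samples):
    """samples: list of (label, feature_tuple). Returns (n_correct, n_total)."""
    correct = 0
    for i, (label, features) in enumerate(samples):
        # Build a model from every sample except the held-out one.
        training = samples[:i] + samples[i + 1:]
        groups = {}
        for lab, feat in training:
            groups.setdefault(lab, []).append(feat)
        centroids = {lab: centroid(feats) for lab, feats in groups.items()}
        # Classify the held-out sample by its nearest group centroid.
        predicted = min(
            centroids, key=lambda lab: math.dist(features, centroids[lab])
        )
        correct += predicted == label
    return correct, len(samples)


# Invented toy data: (group label, (w, x, y, z)) for each subject.
samples = [
    ("N4", (1.0, 1.0, 1.0, 1.0)),
    ("N4", (1.2, 0.9, 1.1, 1.0)),
    ("N4", (0.8, 1.1, 0.9, 1.2)),
    ("K2", (5.0, 5.0, 5.0, 5.0)),
    ("K2", (5.2, 4.9, 5.1, 5.0)),
    ("K2", (4.8, 5.1, 4.9, 5.2)),
]
result = leave_one_out(samples)
print(result)
```

The returned pair has the same shape as the success rates reported above (correct predictions out of subjects tested).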
Automatic evaluation of pronunciation proficiency
Discussion
Viability of sentential models for the evaluation of
– Segmental proficiency: spectral comparison
– Prosodic proficiency: F0/intensity/durations
in the form of multiple score array coordinates (segments, F0, intensity, durations) = (w, x, y, z)
Comparison seems to work
– A target utterance vs. multiple model native utterances
Better models can be built with
– More (controlled) utterances
– More score resolution
  • Current: score 2 (bad) – score 4 (good)
  • Future: score 1 (worst) – score 3 (fair) – score 5 (best)
References
[1] Boersma, P., “Praat, a system for doing phonetics by computer”, Glot International 5(9/10), pp. 341-345, 2001.
[2] Mahalanobis, P.C., “On the generalized distance in statistics”, Proceedings of the National Institute of Science of India 12, pp. 49-55, 1936.
[3] Moulines, E. & F. Charpentier, “Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones”, Speech Communication 9, pp. 453-467, 1990.
[4] Ramus, F., M. Nespor & J. Mehler, “Correlates of linguistic rhythm in the speech signal”, Cognition 73, pp. 265-292, 1999.
[5] Rhee, S., S. Lee, Y. Lee & S. Kang, “Design and construction of Korean-Spoken English Corpus (K-SEC)”, Malsori 46, pp. 159-174, 2003.
[6] Yoon, K., “Imposing native speakers' prosody on non-native speakers' utterances: The technique of cloning prosody”, Journal of the Modern British & American Language & Literature 25(4), pp. 197-215, 2007.
[7] Yoon, K., “Synthesis and evaluation of prosodically exaggerated utterances”, unpublished manuscript, 2008.