ARCON Corporation
J.D. Tardelli - [email protected]
The Utilization of Subjective Evaluation in the Development of Vocoders
The Utilization of Subjective Evaluation in the Development of Vocoders 11/2003
Slide 2 of 22
Evaluation Basics
• Purpose
– Research
– Vocoder Development
– Vocoder Characterization
– Selection
– Validation
• Types of Conditions of Interest
– Baseline
– Acoustic Background Noise
– Transmission Channel Impairments
– Talker Variability
– Signal Levels
– System Tandems
– Digital Circuit Multiplication Systems
Slide 3 of 22
Subjective Testing - Control of Variables
• Laboratory Factors
– Listening Environment; Audio & Electronics
• Source/Processed Recording Factors
– Speech Material Factors
  • Linguistic and Phonetic
  • Talker Factors
  • Transducer Selection
– Audio and Sampled Bandwidth Factors
– Acoustic Noise Material and Speech + Noise Method
• Listener Factors
• Presentation Factors
– Blocking, Order and Balance
– Audio Level and Sidetone
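One common way to handle the "Blocking, Order and Balance" item above is to assign each listener group a presentation order from a balanced Latin square, so that every condition appears once in every position and follows every other condition equally often. The sketch below is an illustration only; the slides do not say which design was used, and the function name `balanced_latin_square` is hypothetical.

```python
def balanced_latin_square(n):
    """Return presentation orders over conditions 0..n-1.

    Assumed design (not taken from the slides): a Williams-style balanced
    Latin square. For even n it returns n orders; for odd n it also adds
    the reversed orders, giving 2n, to balance first-order carryover.
    """
    # Build the first order with the pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    take_low = True
    while low <= high:
        if take_low:
            first.append(low)
            low += 1
        else:
            first.append(high)
            high -= 1
        take_low = not take_low

    # Every other order is a cyclic shift of the first one.
    rows = [[(c + shift) % n for c in first] for shift in range(n)]
    if n % 2 == 1:
        rows += [list(reversed(r)) for r in rows]
    return rows


if __name__ == "__main__":
    for order in balanced_latin_square(4):
        print(order)
```

Each row would be the condition order for one listener group; condition indices map to whatever vocoder and noise combinations the test plan defines.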
Slide 4 of 22
Associated Issues
• User Population and Face Validity
• Context
– Range of Candidate Systems
– Reference and Calibration Systems
• Listen Only vs. Two-Way Methods
– Delay
– Asymmetric Transmission Channels
– VoIP
• Speech material
– Speech Sample length re impairment distribution
– Uniqueness, Amount Available
– Type
Slide 5 of 22
Associated Issues (cont.)
• Speech material (by increasing contextual content)
– Types
  • Scripted
    – Sounds
    – Words
      » rhyming, CVC, etc.
    – Sentences
      » meaningful, nonsense, semantically anomalous, etc.
    – Connected sentences
    – Scripts
  • Scenario based
    – Representative of application?
    – Informational or Familiar
    – Information flow (balanced?, directional?)
  • Task Based
  • Open
Slide 6 of 22
Performance Characteristics & Test Methodology
• Quality
– Diagnostic Acceptability Measure - DAM
  • Voiers ICASSP77
– Category Rating Tests - ACR (MOS); DCR (DMOS); CCR (CMOS)
  • ITU-T P.800; P.830
  • ITU HANDBOOK ON TELEPHONOMETRY
  • IEEE Recommended Practice for Speech Quality Measurements 1969
– Paired Comparison A/B Tests
  • David, H.A., “The Method of Paired Comparisons,” Oxford
– Multi Stimulus Test with Hidden Reference and Anchor - MUSHRA
  • ITU-R BS.1534-1
– Speech Communication Systems with Noise Suppression Algorithms
  • ITU-T P.NSA
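For the ACR category rating tests listed above, the Mean Opinion Score for a condition is the arithmetic mean of the listeners' ratings on the five-point scale. A minimal sketch, assuming votes arrive as (condition, rating) pairs; the data layout and the function name `mean_opinion_scores` are illustrative and not taken from the slides or from ITU-T P.800.

```python
from collections import defaultdict
from statistics import mean


def mean_opinion_scores(votes):
    """Average ACR ratings (1 = bad ... 5 = excellent) per test condition.

    `votes` is an iterable of (condition, rating) pairs; this layout is an
    assumption made for the example.
    """
    by_condition = defaultdict(list)
    for condition, rating in votes:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating {rating!r} is outside the 1..5 ACR scale")
        by_condition[condition].append(rating)
    return {condition: mean(ratings) for condition, ratings in by_condition.items()}


if __name__ == "__main__":
    example = [("codec A", 4), ("codec A", 5), ("codec B", 2), ("codec B", 3), ("codec B", 4)]
    print(mean_opinion_scores(example))
```

DMOS and CMOS are averaged the same way over their own rating scales, so only the range check would change.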
Slide 7 of 22
Performance Characteristics & Test Methodology
• Speaker Recognizability
– NRL Speaker Recognition Test (speakers unknown)
  • Schmidt-Nielsen SCW95, ICASSP96, JASA 1985
– TNO Speaker Recognition Test (speakers known)
  • Steeneken & Leeuwen 1997
• Language Dependency
– SRT-LD
  • Wijngaarden SCW02, EuroSpeech01, Ph.D. Dissertation 2003
• Conservation of Stress State Characteristics
Slide 8 of 22
Performance Characteristics & Test Methodology
• Communicability
– Conversation Opinion Tests
  • ITU-T P.800
– Conversational & Third Party Listen Only Tests
  • ITU-T P.832, P.581 (HATS)
– Continuous Quality Evaluation Method - ECQ
  • ITU-T P.PAC
– Arcon Communicability Exercise - ACE
  • Tardelli ICASSP96, NAS-NRC CHABA Symposium 1995
– TNO Communicability Test
  • Wijngaarden EuroSpeech01
Slide 9 of 22
Performance Characteristics & Test Methodology
• Intelligibility
– Modified Rhyme Test - MRT
  • ANSI S3.2-1989; House 1965; Kreul 1968
– Diagnostic Rhyme Test - DRT
  • ANSI S3.2-1989; Voiers 1973, 1987
– Consonant-Vowel-Consonant Test - CVC (AI Basis)
  • Fletcher ATT 1920s, JASA 1950; Allen 1994, ICASSP02; Steeneken 1992
– Speech Reception Threshold - SRT
  • Plomp & Mimpen 1979; Wijngaarden & Steeneken EuroSpeech99
– International Civil Aviation Org. Spelling Alphabet - ICAO
  • Moser & Dreher 1955; Schmidt-Nielsen NRL R9035 1987, R9174 1988
– INTELTRANS - (CVC, HATS)
  • CELAR France MOD; J.C. Lafon 1958, 1964, 1968
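The Speech Reception Threshold in the list above is usually estimated adaptively: the speech-to-noise ratio is lowered after a correct response and raised after an error, and the threshold is taken as the average level once the track has settled. The sketch below is a simplified 1-up/1-down staircase and not the exact published Plomp & Mimpen procedure; the 2 dB step, the starting level, and the number of discarded trials are all assumed values.

```python
def staircase_srt(responses, start_snr_db=0.0, step_db=2.0, discard=4):
    """Estimate a speech reception threshold from a simple up-down track.

    `responses` holds one boolean per sentence (True = repeated correctly).
    All parameter defaults are assumptions chosen for illustration.
    """
    snr = start_snr_db
    levels = []
    for correct in responses:
        levels.append(snr)                        # level this sentence was played at
        snr += -step_db if correct else step_db   # harder after a hit, easier after a miss

    settled = levels[discard:]                    # drop the initial approach to threshold
    if not settled:
        raise ValueError("need more responses than discarded trials")
    return sum(settled) / len(settled)


if __name__ == "__main__":
    track = [True, True, False, True, False, True, False, True]
    print(staircase_srt(track))
```

A 1-up/1-down rule converges near the 50% correct point, which is the threshold definition the SRT conventionally targets.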
Slide 10 of 22
Intelligibility Measures vs. Information
Webster, 1979
ANSI S3.5-1969
Slide 11 of 22
Evaluation Decisions
• Purpose
• Types of Conditions
• Performance Characteristics of Importance
• Choice of Test Methodologies
• Development of Test Plan
• Selection Criteria if Selection Test
Slide 12 of 22
Vocoder Development Issues
• Application
– Commercial
– Strategic
– Tactical
• Diagnostic Information
– Intelligibility
– Quality
– Communicability
Slide 13 of 22
Low-Rate Vocoder for Tactical Use
• Harsh Acoustic Noise Environments
• Physical and Jamming Channel Issues
• LPI / LPD
• Intelligibility
• Talker Recognizability
• Conserve Stress State of Talker
• Audio Bandwidth
• Delay
• Size - Weight - Power
Slide 14 of 22
Narrowband Low-Rate Vocoder Intelligibility
[Figure: three charts of DRT score by vocoder (LPC10e, CELP, CVSD-16, MELP, MELPe2.4, MELPe1.2)
Vocoder Intelligibility - Benign Environments: Quiet, H250, Office, MCE
Vocoder Intelligibility - Mild Noise Environments: E3A, SC55, P3C, F15
Vocoder Intelligibility - Severe Environments: HMMWV, M2, CH47]
Intelligibility results for current low-rate military vocoders in acoustic background noise
Slide 15 of 22
Effects of Current Noise Preprocessors
[Figure: two charts by process (Source, NPP, MELP+NPP, MELP) for the Quiet and HMMWV environments
Processing Intelligibility w/ 95% C.I. - DRT score
Processing Quality w/ 95% C.I. - DAM score]
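The results on this slide are reported with 95% confidence intervals. As a rough illustration of how such an interval can be computed from per-listener scores, the sketch below uses a normal approximation; this is an assumption, since the slides do not state the method, and for small listener panels a Student t quantile would give a somewhat wider interval. The function name `mean_with_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(scores, confidence=0.95):
    """Return (mean, lower, upper) for a list of per-listener scores.

    Uses the normal approximation mean +/- z * s / sqrt(n), which is an
    assumption of this sketch rather than the method behind the slides.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate spread")
    centre = mean(scores)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    half_width = z * stdev(scores) / sqrt(n)
    return centre, centre - half_width, centre + half_width


if __name__ == "__main__":
    print(mean_with_ci([90.0, 92.0, 94.0, 96.0, 98.0]))
```

Two conditions whose intervals overlap heavily, as computed this way, should not be ranked against each other without a proper significance test.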
Slide 16 of 22
Road Map to Improved DRT Intelligibility
• Inherent Distinctive Features
– Jakobson, Fant, and Halle 1952; Miller & Nicely, 1955
• DRT Attributes
– Voiers 1973, 1987
• DRT Attributes : Distinctive Features : Acoustic Correlates
– Voiers, Benchmark Papers in Acoustics, V11 1977
• Diagnostic Capabilities of the DRT
• Cook Book
Slide 17 of 22
Inherent Distinctive Features (Jakobson, Fant, and Halle 1952)
• Fundamental Source Features
  • Vocalic / Non-Vocalic
  • Consonantal / Non-Consonantal
• Secondary Consonant Features
– Envelope Features
  • Continuant / Interrupted
  • Checked / Unchecked
  • Strident / Mellow
– Supplementary Source
  • Voiced / Voiceless
• Resonant Features
  • Compact / Diffuse
– Tonality Features
  • Grave / Acute
  • Flat / Plain
  • Sharp / Plain
  • Tense / Lax
– Supplementary Resonator
  • Nasal / Oral
Slide 18 of 22
DRT Attributes
Page | Num | Present word | Absent word | Feature | SubFeature | Present: V/UV | Present: Place | Present: Manner | Absent: V/UV | Absent: Place | Absent: Manner
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
1 | 1 | GOB | BOB | EXP | | Voiced | Velar | Plosive | Voiced | Bilabial | Plosive
1 | 2 | DAUNT | TAUNT | VOICING | NON-FRICTIONAL | Voiced | Alveolar | Plosive | Voiceless | Alveolar | Plosive
1 | 3 | MOOT | BOOT | NASALITY | GRAVE | Voiced | Bilabial | Nasal | Voiced | Bilabial | Plosive
1 | 4 | SHEET | CHEAT | SUSTENTION | UNVOICED | Voiceless | Palato-Alveolar | Fricative | Voiceless | Palato-Alveolar | Affricate
1 | 5 | JAB | GAB | SIBILATION | VOICED | Voiced | Palato-Alveolar | Affricate | Voiced | Velar | Plosive
1 | 6 | POT | TOT | GRAVENESS | UNVOICED | Voiceless | Bilabial | Plosive | Voiceless | Alveolar | Plosive
1 | 7 | GHOST | BOAST | COMPACTNESS | VOICED | Voiced | Velar | Plosive | Voiced | Bilabial | Plosive
1 | 8 | RILL | NILL | EXP | | Voiced | Palato-Alveolar | Approximant | Voiced | Alveolar | Nasal
1 | 9 | ZED | SAID | VOICING | FRICTIONAL | Voiced | Alveolar | Fricative | Voiceless | Alveolar | Fricative
1 | 10 | GNAW | DAW | NASALITY | ACUTE | Voiced | Alveolar | Nasal | Voiced | Alveolar | Plosive
1 | 11 | SHOES | CHOOSE | SUSTENTION | UNVOICED | Voiceless | Palato-Alveolar | Fricative | Voiceless | Palato-Alveolar | Affricate
1 | 12 | CHEEP | KEEP | SIBILATION | UNVOICED | Voiceless | Palato-Alveolar | Affricate | Voiceless | Velar | Plosive
1 | 13 | BANK | DANK | GRAVENESS | VOICED | Voiced | Bilabial | Plosive | Voiced | Alveolar | Plosive
1 | 14 | GOT | DOT | COMPACTNESS | VOICED | Voiced | Velar | Plosive | Voiced | Alveolar | Plosive
1 | 15 | NOSE | ROSE | EXP | | Voiced | Alveolar | Nasal | Voiced | Palato-Alveolar | Approximant
1 | 16 | DINT | TINT | VOICING | NON-FRICTIONAL | Voiced | Alveolar | Plosive | Voiceless | Alveolar | Plosive
1 | 17 | NECK | DECK | NASALITY | ACUTE | Voiced | Alveolar | Nasal | Voiced | Alveolar | Plosive
1 | 18 | THONG | TONG | SUSTENTION | UNVOICED | Voiceless | Dental | Fricative | Voiceless | Alveolar | Plosive
1 | 19 | CHOO | COO | SIBILATION | UNVOICED | Voiceless | Palato-Alveolar | Affricate | Voiceless | Velar | Plosive
1 | 20 | WEED | REED | GRAVENESS | VOICED | Voiced | Labio-velar | Approximant | Voiced | Palato-Alveolar | Approximant
1 | 21 | SHAG | SAG | COMPACTNESS | UNVOICED | Voiceless | Palato-Alveolar | Fricative | Voiceless | Alveolar | Fricative
1 | 22 | KNOB | ROB | EXP | | Voiced | Alveolar | Nasal | Voiced | Palato-Alveolar | Approximant
1 | 23 | VOLE | FOAL | VOICING | FRICTIONAL | Voiced | Labio-Dental | Fricative | Voiceless | Labio-Dental | Fricative
1 | 24 | NIP | DIP | NASALITY | ACUTE | Voiced | Alveolar | Nasal | Voiced | Alveolar | Plosive
1 | 25 | FENCE | PENCE | SUSTENTION | UNVOICED | Voiceless | Labio-Dental | Fricative | Voiceless | Bilabial | Plosive
1 | 26 | SAW | THAW | SIBILATION | UNVOICED | Voiceless | Alveolar | Fricative | Voiceless | Dental | Fricative
1 | 27 | POOL | TOOL | GRAVENESS | UNVOICED | Voiceless | Bilabial | Plosive | Voiceless | Alveolar | Plosive
1 | 28 | YIELD | WIELD | COMPACTNESS | VOICED | Voiced | Palatal | Approximant | Voiced | Labio-velar | Approximant
1 | 29 | GNAT | RAT | EXP | | Voiced | Alveolar | Nasal | Voiced | Palato-Alveolar | Approximant
Slide 19 of 22
DRT Attributes : Distinctive Features : Acoustic Correlates
DRT Attribute | JFH Distinctive Feature | Acoustic Correlates
--- | --- | ---
Voicing | Voiced/Voiceless | harmonic content, energy concentration at LF, long duration, low peak power
Nasality | Nasal/Oral | nasal formants in regions of 200, 800 and 2400 Hz
Sustention | Continuant/Interrupted | gradual onset > 130 msec, low level noise in MF to HF
Sibilation | Strident/Mellow | sustained HF noise of relatively high intensity
Compactness | Compact/Diffuse | LF spectral shape, low loci of 2nd and 3rd formants, dynamics of formant transitions
Graveness | Grave/Acute | HF spectral shape, separation of 2nd and 3rd formants, dynamics of 2nd and 3rd formants
Slide 20 of 22
Diagnostic Capabilities of the DRT
• Talkers
– Male : Female
• Attribute State
– Present : Absent
• Attribute Bias
• Sub-Attribute Scores
• Characteristic Attribute Profile
• Empirical Studies
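The diagnostic breakdown listed above (attribute state, attribute bias, attribute profile) can be sketched from raw response tallies. Because each DRT item is a two-alternative choice, scores are conventionally corrected for guessing as 100 × (right − wrong) / total. The data layout, the function names, and the definition of bias as present score minus absent score are assumptions made for this illustration, not details taken from the slides.

```python
def drt_score(right, wrong):
    """Chance-corrected score for a two-alternative test: 100 * (R - W) / T."""
    total = right + wrong
    if total == 0:
        raise ValueError("no responses to score")
    return 100.0 * (right - wrong) / total


def attribute_profile(tallies):
    """Build per-attribute present, absent, total and bias scores.

    `tallies` maps attribute name -> {"present": (right, wrong),
    "absent": (right, wrong)}. This layout is hypothetical.
    """
    profile = {}
    for attribute, states in tallies.items():
        p_right, p_wrong = states["present"]
        a_right, a_wrong = states["absent"]
        present = drt_score(p_right, p_wrong)
        absent = drt_score(a_right, a_wrong)
        profile[attribute] = {
            "present": present,
            "absent": absent,
            "total": drt_score(p_right + a_right, p_wrong + a_wrong),
            "bias": present - absent,   # assumed definition: present minus absent
        }
    return profile


if __name__ == "__main__":
    print(attribute_profile({"voicing": {"present": (45, 5), "absent": (40, 10)}}))
```

A large positive or negative bias on one attribute points at the acoustic cue a vocoder is adding or removing, which is what makes the profile useful during development.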
Slide 21 of 22
Cook Book for Improved Intelligibility
Slide 22 of 22
Pitfalls in Subjective Evaluation
• Measured Intelligibility vs. Real World Intelligibility
– NAS-NRC CHABA 1989 Symposium, Removal of Noise From Noise-Degraded Speech Signals
– Vocoder Tuned to DRT Words
– Vocoder based on “scripted word” characteristics that are not applicable to conversational speech.
• Danger of "self evaluation" by Vocoder Developers
– Tardelli, ICASSP96, DAM vs MOS Study 1996
Slide 23 of 22
DAM vs. MOS Study
A Systematic Investigation of the Mean Opinion Score (MOS)
and the Diagnostic Acceptability Measure (DAM) for Use in the
Selection of Digital Speech Compression Algorithms
ARCON Corp. 1996
Available in DRAFT form at http://www.arcon.com/dld.html
Slide 24 of 22
P.NSA and WHY
• ETSI/3GPP AMR-NS 1999
• Exp. 3 MMOS w/ Multi-Dimensional Question
You will hear speech samples reproduced in a telephone handset. Every sample consists of four short unconnected sentences in a noise environment. Your task is to indicate your opinion of the overall sound quality with respect to any unnatural sound in the sample. Please make your judgement of the sample considering unnatural sound during the complete sample.
• Resulted in Bimodal Decision
P.NSA Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm
Summary
This document proposes a methodology for evaluating the subjective quality of speech in noise and particularly appropriate for the evaluation of noise suppression algorithms. The proposed methodology uses separate rating scales to independently estimate the subjective quality of the Speech Signal alone, the Background Noise alone, and Overall Quality.
ITU-T SG12/Q7 SQEG, Primarily Dynastat and FT
Slide 25 of 22
INTELTRANS Testbed
Slide 26 of 22
DRT Characteristic Attribute Profile
[Figure: four charts of Intelligibility by Attributes (V, N, Su, Si, G, C, T)
2400bps MELPe DRT Results - Female: BlackHawk UH-60, M2 Bradley Vehicle, MOUT, Mobile Command Enclosure, Office, Quiet
2400bps MELPe DRT Results - Male: BlackHawk UH-60, M2 Bradley Vehicle, MOUT, Mobile Command Enclosure, Office, Quiet
2400bps MELPe DRT Results - Combined: UH-60, UH-60 Present, UH-60 Absent, MCE, MCE Present, MCE Absent
2400bps MELPe DRT Results - Combined: M2, M2 Present, M2 Absent, Quiet, Quiet Present, Quiet Absent]
Slide 27 of 22
Empirical Study of DRT Attributes vs. SNR
Band Limited Gaussian Noise
Voiers, JASA 1973
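A study of scores against SNR needs test material mixed at controlled speech-to-noise ratios. The sketch below scales a noise signal so that the long-term RMS ratio to the speech equals a target SNR in dB. It is an illustration under stated assumptions: it uses white Gaussian noise from `random.gauss` rather than the band-limited noise named on the slide, it measures level over the whole signal with no speech-activity weighting, and the function names are hypothetical.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` so that rms(speech) / rms(scaled noise) hits snr_db."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as the speech")
    noise = noise[:len(speech)]
    noise_level = rms(noise)
    if noise_level == 0:
        raise ValueError("noise signal is silent")
    # Gain that brings the noise to the level implied by the target SNR.
    gain = rms(speech) / (noise_level * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in "speech": a 200 Hz tone sampled at 8 kHz (assumed values).
    speech = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]
    mixed = mix_at_snr(speech, noise, snr_db=6.0)
    print(len(mixed))
```

Band-limiting would be applied to `noise` before the call; the scaling step itself is unchanged.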
Slide 28 of 22
Scripted Material - DRT Word Lists
DINT or TINT - Voicing
MOOT or BOOT - Nasality
SHEET or CHEAT - Sustention
JAB or GAB - Sibilation
POT or TOT - Graveness
GHOST or BOAST - Compactness
Slide 29 of 22
Scripted Material - CVC Nonsense Words
MIG(RAINE), COS(T), HAYM, DIT, TOUP(EE), BACH, POD(IUM), SEM(I), LAL:PAL, REAS(ON), REET:BEET, SAYZ:DAYS, BOD(Y), KOOM, LEP(ER), PONE:BONE, HIES, DACK:BACK, TEEG:LEAGUE, MAHL
Slide 30 of 22
Problems with CVC Test Implementation
• CVC Corpus Balance
– Talker by Word by Environment
– Word by Distinctive Feature by Lexicon
• Regional Dialectal Differences
– New England
  • Spoken “COT” = “CAUGHT”
  • Perception Midwest “CART” = “COT”
• Test Design
– Uniqueness for Talker By Word by Environment by Process
– Balance Across Distinctive Feature by Process
– Balance Across Subject by Stimulus
– Sufficient Subjects for Reasonable Resolution
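The last test-design item, having enough subjects for reasonable resolution, can be given a first-pass number with the standard binomial sample-size formula n = z² p(1 − p) / h², where p is the expected proportion of correct responses and h is the desired half-width of the confidence interval. This is a planning sketch only: it treats every response as independent, which repeated responses from the same listener are not, and the function name `responses_needed` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def responses_needed(p_correct, half_width, confidence=0.95):
    """Scored responses needed to estimate a proportion within +/- half_width.

    Normal-approximation planning formula; independence of responses is an
    assumption that real listening panels only approximate.
    """
    if not 0 < p_correct < 1:
        raise ValueError("p_correct must be strictly between 0 and 1")
    if half_width <= 0:
        raise ValueError("half_width must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil(z * z * p_correct * (1 - p_correct) / half_width ** 2)


if __name__ == "__main__":
    # Example: about 80% correct expected, resolution of +/- 5 points.
    print(responses_needed(0.80, 0.05))
```

Dividing the result by the number of scored words each listener hears per condition gives a rough lower bound on panel size.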
Slide 31 of 22
Diagnostic Capabilities of INTELTRANS
Five acoustic indices are sufficient to characterize all vowels:
• sharp / low
• diffuse / compact
• extreme / no extreme
• flatten / sharpen
• nasal / no nasal
Picture 2: The French vocalic system (from L.J. Boê et al.)
Slide 32 of 22
Diagnostic Capabilities of INTELTRANS (cont.)
Seven acoustic indices are sufficient to determine all the consonants:
• voice / unvoice
• interrupted / no interrupted
• vocalic / no vocalic
• sharp / low
• continuous / discontinuous
• diffuse / compact
• nasal / no nasal
Picture 3: The French consonantal system (from L.J. Boê et al.)