EE679: Speech Processing
A preview
Department of Electrical Engineering, IIT Bombay
Outline
• Speech production (physiology)
• Classification of sounds: articulatory, acoustic
• Speech analysis (signal processing methods for information extraction)
• Hearing, and speech perception
• Speech technology (speech compression, ASR, TTS)
• Audio/music technology
Speech communication
Acoustic waves
Speed = wavelength x frequency
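As a small worked example of the relation above, the sketch below computes wavelengths for a few audible frequencies. The speed of sound (343 m/s, air at roughly 20 °C) and the function name are assumptions made for illustration; they are not given on the slide.

```python
# Assumed value: speed of sound in air at roughly 20 degrees C (not from the slide).
SPEED_OF_SOUND_M_PER_S = 343.0


def wavelength_m(frequency_hz, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Return the wavelength in metres, from speed = wavelength x frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_per_s / frequency_hz


# Print wavelengths for a low, a mid and a high audible frequency.
for f in (100.0, 1000.0, 10000.0):
    print(f"{f:8.0f} Hz -> {wavelength_m(f):.4f} m")
```

Under these assumptions, audible wavelengths range from a few metres at low frequencies down to a few centimetres at high frequencies.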
Information in speech
• Linguistic (phone->word->sentence->message)
• Paralinguistic:
-- speaker-based (pronunciation, age, sex, etc.)
-- expressive (emotions, mood)
The speech signal is characterised by an enormous
range of perceptually contrasting sounds!
Generating speech*
Respiration -> phonation -> articulation
Vibrating vocal cords create puffs of air, giving rise to air pressure variations which reach our ears.
*HyperPhysics, Sound and Hearing, Georgia State University
Vocal tract: Acoustic resonances*
$f_1 = \frac{c}{4L};\quad f_2 = \frac{3c}{4L};\quad f_3 = \frac{5c}{4L};\ \ldots$
*HyperPhysics, Sound and Hearing, Georgia State University (http://hyperphysics.phy-astr.gsu.edu/hbase/sound/)
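The resonances of a uniform tube closed at one end and open at the other fall at odd multiples of c/(4L), that is f_k = (2k - 1) c / (4L). The sketch below evaluates this for a tube of 17 cm, a commonly quoted approximate adult vocal tract length. The length, the speed of sound (343 m/s) and the function name are assumptions for illustration, not values from the slide.

```python
def tube_resonances_hz(length_m, n_resonances=3, speed_m_per_s=343.0):
    """Resonances of a uniform tube closed at one end: f_k = (2k - 1) * c / (4L)."""
    if length_m <= 0:
        raise ValueError("length must be positive")
    return [
        (2 * k - 1) * speed_m_per_s / (4.0 * length_m)
        for k in range(1, n_resonances + 1)
    ]


# Assumed tube length of 0.17 m (illustrative only).
for k, f in enumerate(tube_resonances_hz(0.17), start=1):
    print(f"f{k} = {f:.0f} Hz")
```

With these assumed values the first three resonances land near 500, 1500 and 2500 Hz. A real vocal tract is not a uniform tube, so this is only a first approximation.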
Speech production (Childers, Speech Overview, 1993)
Articulation: producing the various sounds of speech*
[Figure: vocal tract diagram. Labels: articulators (vocal cords, tongue, jaw, lips, teeth, velum), the moving muscles which alter the resonant cavities; static cavity and dynamic cavity; vocal cavity, pharyngeal cavity, nasal cavity, oral cavity; trachea connection to lungs; oral sound output; nasal sound output.]
*Securivox tutorial
Von Kempelen's talking machine
1791
"Briefly, the device was operated in the following manner. The right arm rested on the main bellows and
expelled air through a vibrating reed to produce voiced sounds." (This is illustrated in the lower half of the
figure.) "The fingers of the right hand controlled the air passages for the fricatives /sh/ and /s/, as well as the
'nostril' openings and the reed on-off control. For vowel sounds, all the passages were closed and the reed
turned on. Control of vowel resonances was effected with the left hand by suitably deforming the leather
resonator at the front of the device. Unvoiced sounds were produced with the reed off, and by a turbulent flow
through a suitable passage. In the original work, von Kempelen claimed that approximately 19 consonant
sounds could be made passably well." Flanagan, Speech Analysis, Synthesis and Perception, pp. 166-167.
1875
• Alexander Bell invents the method of, and apparatus for,
“transmitting vocal or other sounds telegraphically ... by causing
electrical undulations, similar in form to the vibrations of the air
accompanying the said vocal or other sound”.
=> Major impetus to modern speech processing.
• 1930s: Electrical synthesis of speech by Dudley’s vocoder
Sound -> electrical form*
*The Physics Classroom: http://www.glenbrook.k12.il.us/gbssci/phys/Class/sound/u11l2a.html
Speech “waveform”
Speech Waveforms from “my speech”
(a) start of “y” vowel
(b) “ee” vowel
(c) “s” consonant
Components of sound
A sound usually comprises several frequency components.
Depending on the relationships among the frequency components, the sound can elicit a sensation of pitch.
Speech production
• Vocal cords (larynx) modulate the airflow from the lungs by rapid opening and closing; the rate of vibration is determined by their mass and tension.
Pitch frequency ranges: male: 80-160 Hz; female: 160-320 Hz; singers: over 2 octaves.
• Vocal tract shapes the vocal cord vibrations into the intricate sounds of speech via changes in shape that produce various acoustic resonances.
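The pitch ranges quoted above can be read in octaves, where one octave is a doubling of frequency. The sketch below computes the octave span of each range and the corresponding pitch periods. The helper names are hypothetical; the only numbers used are the ranges from the slide.

```python
import math


def octaves_spanned(low_hz, high_hz):
    """Number of octaves between two frequencies (one octave = a factor of 2)."""
    return math.log2(high_hz / low_hz)


def pitch_period_ms(f0_hz):
    """Duration of one vocal cord vibration cycle, in milliseconds."""
    return 1000.0 / f0_hz


# Pitch ranges as quoted on the slide.
for label, (lo, hi) in {"male": (80.0, 160.0), "female": (160.0, 320.0)}.items():
    print(
        f"{label}: {octaves_spanned(lo, hi):.1f} octave, "
        f"periods from {pitch_period_ms(hi):.2f} to {pitch_period_ms(lo):.2f} ms"
    )
```

Each quoted speaking range spans one octave, which puts the "over 2 octaves" figure for singers in perspective.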
Vocal tract “filter”*
• The sound spectrum is modified by the shape of the vocal tract.
• The resonant frequencies of the vocal tract cause peaks in the spectrum called formants.
*Childers, Speech Overview
Most important aspects of speech…
• The intelligence in speech is encoded in the power
spectrum of the acoustic pressure wave.
• Different articulatory configurations result in signals
with different spectra, esp. different resonance
frequencies called formants, which are perceived as
different sounds.
• The different spectra make up the finite alphabet of
symbols (linguistic code) governed by a hierarchy of
linguistic rules.
Basic sounds of speech: Phones
• The speech signal can be divided into sound segments
with fixed articulation and acoustics over short intervals.
i.e. articulatory configuration <=> acoustic properties
Smallest meaningful sound unit: “phone”
(i.e. set of distinctive sounds of a language)
In Indian written scripts, one symbol represents one phone.
Classification of speech sounds
Vowels and Consonants
• Vowels: steady sounds specified by position of the articulators (typically, tongue)
• Consonants: dynamic sounds classified by place and manner of articulation
Place of articulation
(constriction of vocal tract)
“Decoding” the speech signal: visible speech
“my speech”
Dark areas of the spectrogram show high intensity
– Voiced segments are much louder than unvoiced ones
– Horizontal dark bands are the formant peaks
– “s” has high-frequency content
– Vertical bands are individual larynx closures
– The “y” of “my” is a diphthong: two successive vowels
Machli jal ki hai raani jeevan uska he paani (Hindi: “The fish is the queen of the water; water is its life”)
Indian costumes are quite colourful
Speech perception
Distinct stages of physiological processing
in the auditory system:
Peripheral auditory system (Ears) -> analysis
Auditory nervous system (Brain) -> synthesis
Audible sound
Sound and Sensation
A sound of given frequency components and sound pressure levels leads to perceived sensations that can be distinguished in terms of:
– loudness <-- intensity
– pitch <-- fundamental frequency
– timbre (“quality” or “colour”) <-- other spectro-temporal properties
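Sound pressure level is conventionally expressed in decibels relative to a reference pressure of 20 µPa. The sketch below shows that conversion as a physical correlate of loudness. Note the assumption: dB SPL measures the stimulus, whereas loudness is a perceived quantity that also depends on frequency and duration, so the function (a hypothetical helper) does not compute loudness itself.

```python
import math

# Standard reference pressure for sound pressure level in air: 20 micropascals.
P_REF_PA = 20e-6


def spl_db(p_rms_pa):
    """Sound pressure level in dB SPL for an RMS pressure given in pascals."""
    if p_rms_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(p_rms_pa / P_REF_PA)


# Each tenfold increase in pressure adds 20 dB.
for p in (20e-6, 20e-5, 20e-4, 1.0):
    print(f"{p:.6f} Pa -> {spl_db(p):.1f} dB SPL")
```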
Our auditory apparatus
Cochlea: the ear’s microphone
HyperPhysics, Sound and Hearing, Georgia State University (http://hyperphysics.phy-astr.gsu.edu/hbase/sound/soucon.html#soucon)
Basilar Membrane
Location-dependent frequency “resonance”
• Thickness and tension vary along its length
• Traveling wave has maximum vibration amplitude at a location depending on its frequency
Basilar Membrane
Frequency-to-place transformation (Fourier analysis)
HyperPhysics, Sound and Hearing, Georgia State University (http://hyperphysics.phy-astr.gsu.edu/hbase/sound/soucon.html#soucon)
Applications
• Automatic speech recognition/understanding
• Text-to-speech synthesis
• Speaker verification (biometric)
• Digital storage/transmission of speech
• Aids to the handicapped
• Enhancement of quality
Transmission/storage
Waveform coding: distortion vs bit rate
What distortion is “acceptable” depends on the application and on human perception.
Digital audio bit rates: Waveform coding
Format              Sample rate (kHz)   Bits/sample
Telephony           8                   12 (=> 96 kbps)
Wideband audio      16                  16
Hi-fidelity audio   44.1                16
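The bit rate of uncompressed waveform coding is the sample rate multiplied by the bits per sample. The sketch below applies that product to the three formats in the table. The per-channel (mono) reading and the function name are assumptions for illustration.

```python
def bit_rate_kbps(sample_rate_khz, bits_per_sample, channels=1):
    """Uncompressed bit rate in kbps: kHz x bits/sample x channels."""
    return sample_rate_khz * bits_per_sample * channels


# (sample rate in kHz, bits per sample) as listed in the table.
formats = {
    "Telephony": (8, 12),
    "Wideband audio": (16, 16),
    "Hi-fidelity audio": (44.1, 16),
}

for name, (rate_khz, bits) in formats.items():
    print(f"{name}: {bit_rate_kbps(rate_khz, bits):.1f} kbps per channel")
```

The telephony row reproduces the 96 kbps figure given in the table; the other two rows follow from the same product.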
Source-filter model parameters
Pitch and vocal tract shape vary slowly in time
Frame-based coding of speech
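Because pitch and vocal tract shape vary slowly, the signal can be cut into short frames and the model parameters estimated once per frame. A minimal sketch of the framing step follows. The 20 ms frame length, 10 ms hop and the function name are assumed values chosen for illustration, not parameters stated on the slides.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=20.0, hop_ms=10.0):
    """Cut a sample sequence into overlapping fixed-length frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(round(sample_rate_hz * frame_ms / 1000.0))
    hop_len = int(round(sample_rate_hz * hop_ms / 1000.0))
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop must be at least one sample long")

    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop_len
    return frames


# One second of placeholder samples at the 8 kHz telephony rate.
signal = [0.0] * 8000
frames = split_into_frames(signal, sample_rate_hz=8000)
print(len(frames), "frames of", len(frames[0]), "samples each")
```

A coder would then estimate pitch and vocal tract parameters for each frame and transmit those instead of the raw samples.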
Automatic speech recognition
• To extract the linguistic code (a structured sequence of discrete symbols) from an analysis of the acoustic speech signal.
• That is, only continuous, noisy measurements of a non-stationary function of time are available.
Automatic speech recognition
• Feature calculation (to a more distinctive domain)
• Pattern classification with respect to previously
trained models of phones/words
• Improved transcription based on language model
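The pattern classification step can be illustrated with a toy nearest-template classifier over two features, the first two formant frequencies (F1, F2). Everything specific here is an assumption: the template values are rounded, illustrative figures of the kind reported for adult male vowels, and they stand in for properly trained models. The feature calculation and language model stages are omitted.

```python
import math

# Hypothetical (F1, F2) templates in Hz; illustrative stand-ins for trained models.
VOWEL_TEMPLATES_HZ = {
    "i": (270.0, 2290.0),
    "a": (730.0, 1090.0),
    "u": (300.0, 870.0),
}


def classify_vowel(f1_hz, f2_hz, templates=VOWEL_TEMPLATES_HZ):
    """Return the label of the template closest to (F1, F2) in Euclidean distance."""
    point = (f1_hz, f2_hz)
    return min(templates, key=lambda label: math.dist(point, templates[label]))


# Classify a few made-up measurements.
for f1, f2 in ((280.0, 2200.0), (700.0, 1100.0), (320.0, 900.0)):
    print((f1, f2), "->", classify_vowel(f1, f2))
```

Practical recognisers use richer features and statistical models, but the structure is the same: map the signal to a feature space, then choose the best-matching trained model.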
ASR: block diagram*
*K. Samudravijaya, A Tutorial on Speech and Speaker Recognition
ASR: Challenges
• Inter- and intra-speaker variations
• Effects of coarticulation in continuous speech
• Background noise and variable channels
Categories of speech recognition tasks
Human to machine:
• Database query/ information retrieval
• Dictation
Human to human:
• Broadcast news
• Lectures
• Voice mail
• Meeting
• Telephone conversation
Speaker recognition
(voice-based biometric)
• The voice signal is considered relatively easy to
acquire/collect.
• Speech enables an (indirect) measurement of
physiological features (i.e. characteristics of the
speaker’s voice production system).
• Applications:
Commercial (access control, segmentation)
Military, Forensic
Speech Synthesis
What: To convert a text string into a speech waveform
Why: For technology to communicate when a display would be inconvenient.
Basic TTS System
Prosody => A phone is long/short, loud/soft, high/low-pitched
Outline
• Speech production (physiology)
• Classification of sounds: articulatory, acoustic
• Speech analysis (signal processing methods for information extraction)
• Hearing, and speech perception
• Speech technology (speech compression, ASR, TTS)
• Audio/music technology
Text / References
• Douglas O'Shaughnessy, Speech Communications: Human and Machine, Universities Press (India) Ltd., 2001
• Rabiner and Schafer, Digital Processing of Speech Signals
• IITB Moodle for all course-related hand-outs
Recognition: “Vowel triangle”
Speaker variability: due to differences in vocal physiology