ee679-overview2015
7/23/2015
Department of Electrical Engineering, IIT Bombay
EE679: Speech Processing
A preview
Why does signal processing for speech need a special course?
Signal processing is concerned with the mathematical representation of the signal and the algorithmic operations carried out to modify the signal or to extract information from it.
The representation and the algorithms are application-domain specific, i.e. there are no generic methods.
An understanding of the signal and of the application is crucial to the success of the signal processing methods.
Everyday speech technology
Mobile telephony
Automatic speech recognition (speech to text)
Speech synthesis (text to speech)
Understanding speech communication
Acoustic waves
Speed = wavelength x frequency
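As a rough illustration of the relation speed = wavelength x frequency, the sketch below computes the wavelength of a few tones in air. The speed of 343 m/s (air at about 20 degrees C), the chosen frequencies and the function name `wavelength` are assumptions for illustration, not part of the slides.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def wavelength(frequency_hz, speed=SPEED_OF_SOUND):
    """Return the wavelength in metres, from speed = wavelength x frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed / frequency_hz


# Low frequencies have long wavelengths, high frequencies short ones.
for f in (100.0, 1000.0, 4000.0):
    print(f"{f:6.0f} Hz -> {wavelength(f):.3f} m")
```

Under these assumptions a 100 Hz tone has a wavelength of several metres, while a 4 kHz tone is under 10 cm.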
6Department of Electrical Engineering , IIT Bombay
Information in speech?
Linguistic (message -> sentences -> words -> phonemes)
The speech signal is characterised by an enormous range of elementary perceptually contrasting sounds!
Paralinguistic:
-- expressive (emotions, mood)
-- speaker-based (age, gender, accent and style)
Generating speech*
Respiration -> phonation -> articulation
Vibrating vocal cords create puffs of air giving rise to air pressure variations which reach our ears.
*HyperPhysics, Sound and Hearing, Georgia State University
Speech production (Childers, Speech Overview, 1993)
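Vocal tract: Acoustic resonances*
$f_1 = \frac{c}{4L},\quad f_2 = \frac{3c}{4L},\quad f_3 = \frac{5c}{4L},\quad \ldots$
(c: speed of sound, L: length of the vocal tract)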
*HyperPhysics, Sound and Hearing, Georgia State University
(http://hyperphysics.phy-astr.gsu.edu/hbase/sound/)
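A minimal sketch of these resonances, assuming the vocal tract is idealised as a uniform tube closed at one end (the glottis) and open at the other (the lips), so that the resonances fall at odd multiples of c/(4L). The tube length of 17 cm, the speed of sound of 343 m/s and the function name `tube_resonances` are illustrative assumptions.

```python
def tube_resonances(length_m, n_resonances=3, speed=343.0):
    """Resonances (Hz) of a uniform tube closed at one end: (2k - 1) * c / (4L)."""
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [(2 * k - 1) * speed / (4.0 * length_m)
            for k in range(1, n_resonances + 1)]


# Assumed vocal tract length of 0.17 m (a commonly quoted adult value).
for k, f in enumerate(tube_resonances(0.17), start=1):
    print(f"f{k} = {f:.0f} Hz")
```

With these values the first three resonances come out near 500, 1500 and 2500 Hz. A shorter tube would shift all of them upward in proportion.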
Articulation: producing the various sounds of speech*
Figure labels:
Articulators (moving muscles which alter the resonant cavities): vocal cords, tongue, jaw, lips, teeth, velum
Cavities: vocal cavity, pharyngeal cavity, nasal cavity, oral cavity (the figure also marks a static cavity and a dynamic cavity)
Connections: trachea connection to lungs, oral sound output, nasal sound output
*Securivox tutorial
Vocal tract filter*
The sound spectrum is modified by the shape of the vocal tract. The resonant frequencies of the vocal tract cause peaks in the spectrum called formants.
*Childers, Speech Overview
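One way to picture formants as spectral peaks is to model the vocal tract filter as a cascade of second-order resonators and look at its gain across frequency. This is only a sketch: the formant values (500, 1500, 2500 Hz), the 80 Hz bandwidth, the 8 kHz sampling rate and all function names are assumptions, and a cascade of resonators is one common simplification rather than the method used in the cited overview.

```python
import cmath
import math


def resonator_gain(freq_hz, centre_hz, bandwidth_hz, fs):
    """Gain of a two-pole digital resonator at freq_hz."""
    r = math.exp(-math.pi * bandwidth_hz / fs)      # pole radius from bandwidth
    theta = 2 * math.pi * centre_hz / fs            # pole angle from centre frequency
    z_inv = cmath.exp(-2j * math.pi * freq_hz / fs)
    return 1.0 / abs(1 - 2 * r * math.cos(theta) * z_inv + r * r * z_inv * z_inv)


def vocal_tract_gain(freq_hz, formants, fs=8000.0, bandwidth_hz=80.0):
    """Gain of a cascade of resonators, one per formant."""
    gain = 1.0
    for centre in formants:
        gain *= resonator_gain(freq_hz, centre, bandwidth_hz, fs)
    return gain


def spectral_peaks(formants, fs=8000.0, step_hz=10.0):
    """Frequencies (Hz) at which the sampled gain curve has a local maximum."""
    freqs = [k * step_hz for k in range(1, int(fs / 2 / step_hz))]
    gains = [vocal_tract_gain(f, formants, fs) for f in freqs]
    return [freqs[i] for i in range(1, len(gains) - 1)
            if gains[i - 1] < gains[i] > gains[i + 1]]


# Hypothetical formant frequencies for a neutral vowel.
print(spectral_peaks([500.0, 1500.0, 2500.0]))
```

The peaks of the gain curve land at or very near the resonator centre frequencies, which is the sense in which resonances of the tract show up as formants in the spectrum.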
Von Kempelen's talking machine
1791
"Briefly, the device was operated in the following manner. The right arm rested on the main bellows
1875
Alexander Bell invents the method of, and apparatus for, transmitting vocal or other sounds telegraphically ... by causing electrical undulations, similar in form to the vibrations of the air accompanying the said vocal or other sound.
=> Major impetus to modern speech processing.
1930s: Electrical synthesis of speech by Dudley's vocoder
Sound -> electrical form*
*The Physics Classroom: http://www.glenbrook.k12.il.us/gbssci/phys/Class/sound/u11l2a.html
Speech waveform
Waveforms from my speech
(a) start of y vowel
(b) ee vowel
(c) s consonant
Low pitch tone: T0 = 10 msec, frequency (F0) = 1/T0 = 100 Hz
High pitch tone: T0 = 3.3 msec, frequency = 300 Hz
(Vertical axis: air pressure variation)
1 Hertz = 1 vibration/sec
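The relation F0 = 1/T0 can be checked numerically, and the period can also be measured from a waveform. The sketch below converts the two periods quoted above and then estimates F0 of a synthetic 100 Hz tone by picking the lag with the largest autocorrelation. The 8 kHz sampling rate, the 70-400 Hz search range and the function names are assumptions, and autocorrelation is just one simple period-finding method.

```python
import math


def f0_from_period(t0_seconds):
    """Fundamental frequency in Hz from the period T0 in seconds."""
    return 1.0 / t0_seconds


def estimate_f0(samples, fs, f_min=70.0, f_max=400.0):
    """Estimate F0 (Hz) as fs divided by the lag with the largest autocorrelation."""
    min_lag = int(fs / f_max)
    max_lag = int(fs / f_min)
    window = len(samples) - max_lag  # same number of products for every lag
    if window <= 0:
        raise ValueError("signal too short for the requested f_min")
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[n] * samples[n + lag] for n in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag


FS = 8000  # Hz, assumed sampling rate
tone = [math.sin(2 * math.pi * 100.0 * n / FS) for n in range(320)]  # 40 ms

print(f0_from_period(0.010))          # period of 10 msec
print(round(f0_from_period(0.0033)))  # 3.3 msec gives about 303 Hz, rounded to 300 Hz on the slide
print(estimate_f0(tone, FS))          # should come out close to 100 Hz
```

The estimate is limited to whole-sample lags, so its resolution gets coarser as the pitch rises.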
Components of sound
A sound usually comprises several frequency components.
Depending on the relationships of the frequency components, the sound can elicit a sensation of pitch.
Figure panels: 300 Hz; 600 Hz; 900 Hz; 300 Hz + 600 Hz; 300 Hz + 600 Hz + 900 Hz
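To illustrate how components at 300, 600 and 900 Hz combine, the sketch below adds three sinusoids, recovers the components with a small discrete Fourier transform, and checks that the sum repeats every 1/300 s. That shared period is why harmonically related components can be heard as one tone with a 300 Hz pitch. The 9 kHz sampling rate (chosen so one period is exactly 30 samples), the equal amplitudes and the function names are assumptions.

```python
import cmath
import math

FS = 9000  # Hz, assumed sampling rate: 1/300 s is exactly 30 samples


def harmonic_sum(freqs_hz, n_samples, fs=FS):
    """Sum of unit-amplitude sinusoids at the given frequencies."""
    return [sum(math.sin(2 * math.pi * f * n / fs) for f in freqs_hz)
            for n in range(n_samples)]


def dft_magnitudes(x):
    """Magnitudes of the DFT bins from 0 Hz up to fs/2 (direct O(N^2) sum)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]


signal = harmonic_sum([300, 600, 900], 90)  # 10 ms, three periods of 300 Hz
mags = dft_magnitudes(signal)
bin_hz = FS / len(signal)                   # 100 Hz per bin
components = [k * bin_hz for k, m in enumerate(mags) if m > 1.0]
print(components)

period = FS // 300                          # 30 samples
repeats = all(abs(signal[n] - signal[n + period]) < 1e-9
              for n in range(len(signal) - period))
print(repeats)
```

Only the three bins at the component frequencies carry significant energy, and the summed waveform repeats with the period of the lowest component.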
Classification of speech sounds
Vowels and Consonants
Vowels: steady sounds specified by position of the articulators (typically, tongue)
Consonants: dynamic sounds classified by place and manner of articulation
Place of articulation (constriction of vocal tract)
Basic sounds of speech: Phones
The speech signal can be divided into sound segments with fixed articulation and acoustics over short intervals, i.e. articulatory configuration -> acoustic properties
Smallest meaningful sound unit: phone (i.e. set of distinctive sounds of a language)
In Indian written scripts, one symbol represents one phone.
PRAAT examples
Physiology (articulator motion)
Sound with specific acoustic characteristics (seen in waveform and spectrum)
Perception of certain sound qualities
Speech production basics
Vocal cords (larynx) modulate the airflow from the lungs by rapid opening-closing; the rate of vibration is determined by their mass and tension.
Pitch frequency ranges: male: 80-160 Hz; female: 160-320 Hz; singers: over 2 octaves.
Vocal tract shapes the vocal cord vibrations into the intricate sounds of speech via changes in shape to produce various acoustic resonances.
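For the pitch ranges quoted above, an octave is a doubling of frequency, so the span of a range in octaves is log2(high/low). A small sketch (the function name `octaves` is an assumption):

```python
import math


def octaves(f_low_hz, f_high_hz):
    """Number of octaves spanned by a frequency range (one octave = a doubling)."""
    return math.log2(f_high_hz / f_low_hz)


print(octaves(80, 160))    # male range quoted on the slide
print(octaves(160, 320))   # female range quoted on the slide
print(80 * 2 ** 2)         # two octaves above 80 Hz, in Hz
```

Each of the quoted speaking ranges spans one octave, so a singer covering more than two octaves from 80 Hz would reach beyond 320 Hz.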
Glottal folds in action
Outline
Speech production (physiology)
Classification of sounds: articulatory, acoustic
Speech analysis (signal processing methods for information extraction)
Hearing, and speech perception
Speech technology (speech compression, ASR, TTS)
Audio/music technology
Text / References
Douglas O'Shaughnessy, Speech Communications: Human and Machine, Universities Press (India) Ltd., 2001
Rabiner and Schafer, Digital Processing of Speech Signals
IITB Moodle for all course-related hand-outs
Evaluation
Computing assignments (Python preferred)
Exams: mid semester, end semester