Vogler and Metaxas
University of Toronto Computer Science
CSC 2528: Handshapes and Movements: Multiple-channel ASL recognition
Christian Vogler and Dimitris Metaxas (presented by Christopher Collins)
Overview: Part II
- Introduction to ASL recognition
- Challenges of ASL recognition
- Related work
- Modelling
  - Phoneme-based modelling
  - Independent channels
  - Handshape
- Parallel Hidden Markov Models
- Experiments
- Conclusions and future work
ASL Recognition: Introduction
- Computer interaction is still mainly keyboard/mouse
  - requires literacy in a written language or an agreed-upon standard written form of ASL (e.g. sign-writing)
  - difficult for many people who are deaf
ASL Recognition: Challenges
More difficult than speech recognition due to:
- simultaneous events
- inflections
- phonology poorly understood, no agreed standard
Challenges of Simultaneity
Related Work
C. Vogler and D. Metaxas. Parallel Hidden Markov Models for ASL Recognition (1999).
G. Fang et al. Signer-independent continuous sign language recognition based on SRN/HMM (2001).
R.-H. Liang and M. Ouhyoung. A real-time continuous gesture recognition system for sign language (1998).
Overview
- HMM-based approach to ASL recognition
- parallel HMMs for different channels
- channels are left and right handshape and movement
- uses the movement-hold phonology
Movement-Hold Example
Handshape Modelling
- Most previous work uses joint and abduction angles as features (low-level)
- Also experiment with a measure of the openness of a finger (high-level)
  - height and width of quadrilateral
  - MPJ angle
  - abduction angles
Extensions to HMM
A regular HMM models one process evolving over time
To model parallel, possibly interacting processes with a regular HMM, events must evolve in lockstep
Earlier work by Vogler and Metaxas explains the development of the parallel HMM model
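To see why the lockstep requirement is costly, note that a single regular HMM covering several channels needs one state for every combination of per-channel states, whereas independent per-channel models only need the states of each channel. The sketch below counts both. The channel names and state inventories are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from itertools import product
from math import prod

# Hypothetical per-channel state inventories (illustrative only).
channels = {
    "right_handshape": ["B", "5", "A", "1"],
    "right_movement": ["hold", "straight", "arc"],
    "left_handshape": ["B", "5", "none"],
    "left_movement": ["hold", "straight", "none"],
}


def joint_states(channel_states):
    """Enumerate the states one lockstep HMM would need: every combination."""
    return list(product(*channel_states.values()))


def state_counts(channel_states):
    """Compare the joint state count with the total for independent channels."""
    sizes = [len(states) for states in channel_states.values()]
    return {"joint": prod(sizes), "independent": sum(sizes)}


counts = state_counts(channels)
print(counts)
```

The joint count grows as a product of the channel sizes, while the independent count grows as their sum, which is the motivation for modelling the channels separately.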
Factorial HMM
Coupled HMM
Parallel HMM
Combination of Processes
Using the independence assumption, combine the path probabilities from each channel (where the states represent the same sign sequence) by multiplying them, then choose the most probable state sequence.
Running time is polynomial in the number of states and linear in the number of parallel processes
More info: C. Vogler and D. Metaxas, Parallel Hidden Markov Models for ASL Recognition; Proc. Int. Conf. on Comp. Vis., Greece, 1999.
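The combination step can be sketched as follows. This is a simplified illustration, not the authors' decoder: it assumes each channel has already produced, for every candidate sign sequence, the log probability of its best state path, and it combines them by adding logs (equivalent to multiplying probabilities). The function name `combine_channels`, the sign glosses, and the probability values are all hypothetical.

```python
import math


def combine_channels(channel_log_probs):
    """Pick the sign sequence with the highest product of channel probabilities.

    channel_log_probs: list of dicts, one per channel, each mapping a
    candidate sign sequence (tuple of glosses) to the log probability of
    the best path for that sequence in that channel.
    """
    # Only sequences scored by every channel can be combined.
    shared = set(channel_log_probs[0])
    for scores in channel_log_probs[1:]:
        shared &= set(scores)
    if not shared:
        raise ValueError("no sign sequence is scored by every channel")

    # Independence assumption: multiply probabilities, i.e. add log probabilities.
    combined = {
        hyp: sum(scores[hyp] for scores in channel_log_probs) for hyp in shared
    }
    best = max(combined, key=combined.get)
    return best, combined[best]


# Hypothetical per-channel scores for two candidate sentences.
movement = {
    ("JOHN", "BUY", "HOUSE"): math.log(0.020),
    ("JOHN", "SEE", "HOUSE"): math.log(0.030),
}
handshape = {
    ("JOHN", "BUY", "HOUSE"): math.log(0.050),
    ("JOHN", "SEE", "HOUSE"): math.log(0.010),
}

best, log_score = combine_channels([movement, handshape])
print(best, math.exp(log_score))
```

In this made-up example the movement channel alone prefers the second sentence, but the handshape channel outweighs it once the two are multiplied. The cost of the combination is one addition per channel per hypothesis, consistent with the linear dependence on the number of parallel processes.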
Experiments
Compare handshape models (joint angles vs. quadrilateral) for handshape recognition task
Compare PaHMM model with various channel combinations against single hand movement channel (naïve baseline?)
Vocabulary of 22 signs, 400 training sentences of length 2-7 signs, and 99 test sentences
Omitted left-hand handshape?
Choice of Handshape Model
Measure correctly recognized handshape (recognizing signs with handshape alone not possible)
Quadrilateral feature vector results in better (and more consistent) recognition accuracy
Experimental Results
H = correct, D = deletions, S = substitutions, I = insertions, N = number of signs
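Counts of this kind are usually summarized with the measures used in speech recognition toolkits such as HTK: H = N - D - S, percent correct = H / N, and accuracy = (H - I) / N. The sketch below assumes the slide's results follow that convention, which the transcript does not confirm; the function name `recognition_scores` and the example counts are hypothetical and are not results from the paper.

```python
def recognition_scores(n, d, s, i):
    """Compute HTK-style scores from error counts.

    n: number of signs in the reference, d: deletions,
    s: substitutions, i: insertions.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    h = n - d - s  # correctly recognized signs
    return {
        "H": h,
        "correct": h / n,         # ignores insertions
        "accuracy": (h - i) / n,  # penalizes insertions as well
    }


# Made-up counts for illustration only.
scores = recognition_scores(n=100, d=2, s=5, i=3)
print(scores)
```

Accuracy can be lower than percent correct, and can even go negative, because insertions are subtracted from the number of correct signs.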
Conclusions
Handshape information is important in ASL recognition
Parallel HMM a promising model for multi-channel data
Future Work
- Training/test data from native signers
- Include facial expressions
- Use of relative spatial information (classifiers)
- Larger vocabulary
- Incorporation of language modelling to improve recognition, such as n-gram or parsing