![Page 1: Sign Language Representation for Machine Translation Sara Morrissey NCLT/CNGL Seminar Series 1 st April, 2009](https://reader030.vdocument.in/reader030/viewer/2022032522/56649d635503460f94a46392/html5/thumbnails/1.jpg)
Sign Language Representation for Machine Translation
Sara Morrissey
NCLT/CNGL Seminar Series
1st April, 2009
Why is there no writing system?
• Social reasons
  – Variation and demographic spread
• Political reasons
  – Recognition
• Linguistic reasons
  – Visual-gestural-spatial languages, simultaneous phoneme production
Implications of the lack of a writing system
• …for Deaf people
  – Forced to use a language that is not their native language
• …for the languages
  – Social acceptance, standardisation (Pizzuto, 2006)
• …for MT
  – Limits availability of domain-specific corpora
  – No standards, so systems are difficult to compare
  – Results on small datasets are of limited significance
  – Difficult to use NLP tools developed for spoken languages
Sign Language Representation Formats
• Linear
  – Stokoe Notation, HamNoSys
• Multi-level
  – Gloss, Partition/Constitute, Movement-Hold, SiGML
• Iconic
  – SignWriting
Linear Symbolic Notations
Stokoe Notation: “don’t know”
HamNoSys Notation: “nineteen”
Multi-level Representations
Gloss Annotation
Partition/Constitute
Movement-Hold
SiGML:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE sigml SYSTEM "http://...">
<sigml>
<hamgestural_sign gloss="going to DGS">
<sign_manual both_hands="true">
<handconfig handshape="finger2" thumbpos="out"/>
<handconfig extfidir="uo" palmor="1"/>
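Because SiGML is plain XML, its features can be read with a standard XML parser. The following is a minimal sketch, assuming a closed-off version of the fragment shown on the slide; the closing tags and the `hand_features` helper are illustrative additions, not part of the original example or of any SiGML toolkit.

```python
import xml.etree.ElementTree as ET

# Hypothetical, closed-off SiGML fragment modelled on the slide excerpt.
# The closing tags are added here only so that the snippet parses.
SIGML = """
<sigml>
  <hamgestural_sign gloss="going to DGS">
    <sign_manual both_hands="true">
      <handconfig handshape="finger2" thumbpos="out"/>
      <handconfig extfidir="uo" palmor="1"/>
    </sign_manual>
  </hamgestural_sign>
</sigml>
"""


def hand_features(sigml_text):
    """Return {gloss: merged handconfig attributes} for each sign."""
    root = ET.fromstring(sigml_text)
    features = {}
    for sign in root.iter("hamgestural_sign"):
        merged = {}
        for config in sign.iter("handconfig"):
            merged.update(config.attrib)
        features[sign.get("gloss")] = merged
    return features


features = hand_features(SIGML)
print(features)
```

Flattening the attributes into one dictionary per sign is only one possible view; a real system would probably keep the element order, since it can carry meaning.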
Iconic
SignWriting
But different groups, different requirements
(Pizzuto et al., 2006):
The aspect of a language chosen for its representation is largely dictated by the society and culture developing the writing system, and by the purposes and settings for which such communication is required.
Deaf, linguists, language processors…
Requirements for MT
• Large bilingual domain-specific corpus of good-quality digital data
• Gold standard reference
• Segmentation algorithms for separating words, phrases and sentences
• Alignment methodologies for these units
• Searching the source and target texts
• Acceptable capturing of the language for output
Discussion of current methods
• Stokoe (Stokoe, 1960)
  – Difficult to capture classifiers and non-manual features (NMFs)
  – Decontextualised signs only
  – ASCII version (Mandel, 1993)
• HamNoSys (Prillwitz, 1989)
  – NMFs included
  – Subsection of 150 symbols for handwriting purposes
  – Mac usage, Windows font
Discussion of current methods (2)
• Gloss Annotation (Leeson et al., 2006; Neidle et al., 2002)
  – Most commonly used in MT and by linguists
  – No universal conventions
  – Extensible
  – Using one language to describe another
  – Allows for simultaneous timed logging of features
  – Tools widely available
  – SL and linguistic knowledge a requirement
  – No knowledge of supplementary symbolic system required
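The "simultaneous timed logging of features" point can be pictured as several annotation tiers sharing one timeline. The sketch below is a minimal illustration of that idea, assuming a simple interval model; the `Annotation` class, tier names and example values are invented for illustration and are not taken from the ISL or DGS corpora or from any annotation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    tier: str       # e.g. "gloss" or "nmf" (tier names are illustrative)
    start_ms: int   # start of the interval on the video timeline
    end_ms: int     # end of the interval (exclusive)
    value: str


def overlapping(annotations, target):
    """Annotations on other tiers whose time span overlaps the target's."""
    return [
        a for a in annotations
        if a.tier != target.tier
        and a.start_ms < target.end_ms
        and target.start_ms < a.end_ms
    ]


# Invented example data: two manual glosses and one non-manual feature.
timeline = [
    Annotation("gloss", 0, 400, "FLIGHT"),
    Annotation("gloss", 400, 900, "WHEN"),
    Annotation("nmf", 350, 900, "brows-furrowed"),
]

when = timeline[1]
co_occurring = [a.value for a in overlapping(timeline, when)]
print(co_occurring)
```

Keeping time spans explicit is what lets a non-manual feature stretch across more than one manual gloss, which a single linear gloss string cannot express.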
Discussion of current methods (3)
• Partition/Constitute (Huenerfauth, 2005)
  – Captures movement, classifier and spatial info
  – Comprehensive, hierarchical representation
  – Implicit use of gloss terms
• Movement-Hold (Liddell & Johnson, 1989)
  – Numerically-encoded handshapes
  – Multi-layer
  – Used with recognition technology (Vogler & Metaxas, 2004)
Discussion of current methods (4)
• SiGML (Elliott et al., 2004)
  – Describes HamNoSys for animation (ViSiCAST)
  – Double representation
• SignWriting (Sutton, 1995)
  – Compact icons
  – Information displayed in one place
  – Advocated by SL linguists and growing Deaf
  – Not currently machine readable
Worked Example
• “Data-driven Machine Translation for Sign Languages” (Morrissey, 2008)
• MaTrEx MT system
• Glossed annotations of Irish Sign Language (ISL) and German Sign Language (DGS)
• Air Travel Information System (ATIS) corpus of ~600 sentences
• Translated and signed by native Deaf signers
Hand-crafted gloss annotation corpus
Translation Directions
MaTrEx Experiments
• ISL gloss-to-English text
  – Baseline
  – SMT
  – EBMT 1
  – EBMT 2
  – Distortion limit
ISL-EN MaTrEx Experiments
| System | BLEU | WER | PER |
|---|---|---|---|
| Annotation Baseline | 25.20 | 60.31 | 50.42 |
| SMT | 51.63 | 39.32 | 29.79 |
| EBMT 1 | 50.69 | 37.75 | 30.76 |
| EBMT 2 | 49.76 | 39.92 | 32.44 |
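For readers unfamiliar with the error-rate columns, the sketch below shows one common formulation of word error rate (WER) and position-independent error rate (PER) for a single sentence pair. It is an illustration only: the exact scoring scripts and settings behind the reported figures are not given in the slides, and the example sentences are invented.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent, relative to the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def per(reference, hypothesis):
    """Position-independent error rate in percent (bag-of-words comparison)."""
    ref, hyp = reference.split(), hypothesis.split()
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return 100.0 * (max(len(ref), len(hyp)) - matches) / len(ref)


# Invented sentence pair: the right words in the wrong order.
ref = "what time does the flight leave"
hyp = "the flight leave what time"
print(f"WER {wer(ref, hyp):.2f}  PER {per(ref, hyp):.2f}")
```

WER penalises word-order differences, while PER ignores order entirely, so comparing the two gives a rough indication of how much of the error is due to reordering. Lower is better for both, whereas higher is better for BLEU.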
EN-ISL MaTrEx Experiments
| System | BLEU | WER | PER |
|---|---|---|---|
| ISL-EN best scores | 52.18 | 38.48 | 39.67 |
| SMT | 38.85 | 46.02 | 34.33 |
| EBMT 1 | 39.11 | 45.90 | 34.20 |
| EBMT 2 | 39.05 | 46.02 | 34.21 |
Other experiments
• ISL-DE, DGS-DE, DGS-EN
  – ISL-EN best scores, by 6.38% BLEU
  – EBMT 1 chunks improve for ISL-DE only
  – EBMT 2 chunks improve for ISL-DE only
• DE-ISL, DE-DGS, EN-DGS
  – EN-DGS best scores, by 1.3% BLEU
  – EBMT 1 chunks improve for EN-DGS & EN-ISL
  – EBMT 2 chunks improve for all
• Comparison with RWTH system
  – We’re better! ~2-6% BLEU
• ISL video recognition
• Speech output
ISL Animation
• Poser software
• Hand-crafted 66 videos, 50 sentences
• Played in sequence
• 4 Deaf evaluators
• 2 x 4-point scale
• Intelligibility: 82%
• Fidelity: 72%
• Questionnaire
Demo
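"Played in sequence" suggests a lookup from translated glosses to pre-rendered clips. The sketch below shows one way this could work, assuming one clip per gloss token; the gloss names, file names and the `clips_for` helper are hypothetical and do not describe the actual animation module.

```python
def clips_for(gloss_sentence, clip_index):
    """Map each gloss token to its clip; collect tokens that have no clip."""
    playlist, missing = [], []
    for gloss in gloss_sentence.split():
        clip = clip_index.get(gloss)
        if clip is None:
            missing.append(gloss)
        else:
            playlist.append(clip)
    return playlist, missing


# Hypothetical index from gloss to hand-crafted video file.
CLIP_INDEX = {
    "FLIGHT": "flight.avi",
    "DUBLIN": "dublin.avi",
    "WHEN": "when.avi",
}

playlist, missing = clips_for("FLIGHT DUBLIN ARRIVE WHEN", CLIP_INDEX)
print(playlist, missing)
```

Reporting the missing glosses separately makes the coverage limit of a small hand-crafted clip set visible, rather than silently dropping signs from the output.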
Thesis Conclusions
• Good results can be obtained
• Glossing most appropriate, but not going forward
  – Allowed linguistic-based alignment
  – Linear, easily accessible format
  – Lack of NMF detail, time-consuming, not considered adequate representation of language
• EBMT chunks show potential but require more development
• Development of animation module
Where do we go from here?
(the words are coming out all weird…)
• What is the most appropriate SL representation for MT?
  – Adequately represents the language
  – Animation production
  – Facilitates the translation process
Representation overview, redux
• Glossing: machine readable, doesn’t adequately represent the language or facilitate animation
• Stokoe: ASCII version, not adequate representation
• Partition/Constitute: multi-layered, uses glosses
• Movement-Hold: multi-layered, uses glosses
• SignWriting: compact icons, accepted, potential readability, not machine readable at present
• …
• HamNoSys & SiGML: machine readable, comprehensive description, adapted for animation, suited to SMT
The Future…
• Explore HamNoSys in practice
• MT in medical domain, Health Ireland Partner GP work group questionnaire
• Human Factors
• Minority Language MT
Thank you for listening
Yep, it’s the end!
I hope it wasn’t too long
Any questions?