![Page 1: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/1.jpg)
Understand Localization Standards and Use Them Effectively
John Watkins, President, ENLASO jwatkins@enlaso.com
![Page 2: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/2.jpg)
Agenda
• The Standards Universe
• Core Standards
• Using Standards
![Page 3: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/3.jpg)
The Universe: Standards Evolution
![Page 4: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/4.jpg)
Standards Definitions
• De facto Standards
  – Influence through prevalence
  – Standards may evolve from de facto standards through the cooperation of the industry and a relevant standards body
• Standards
  – Remove barriers to performing functions within an industry
  – Are approved and maintained by neutral third parties
  – Have input from industry to avoid being locked into a proprietary solution
• Open Standards
  – Do the above, but are publicly available (with open access rights)
  – Natural coordination with open source software
  – Luckily, the core localization standards are Open Standards
![Page 5: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/5.jpg)
Standards Benefits
• Standards help tools to work together
  – Eases exchange of information among tools
  – Freedom to work with a wide variety of tools
  – Processes are developed independent of the tools
    • Customer files
    • The right linguists
• Consequently
  – Tools are not constrained
  – Workflow is easier
  – Projects can be faster, better, and cheaper
![Page 6: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/6.jpg)
Standards Information
• Standards management spans various organizations
  – OSCAR/LISA -> Disbanded
    • Standards developed by OSCAR under LISA are now under the Creative Commons Attribution license (see GALA below)
    • European Telecommunications Standards Institute (ETSI) Localization Industry Standards (LIS) Industry Specification Group is the successor for the LISA/OSCAR portfolio (TMX, TBX, SRX…): http://goo.gl/y4JgF
  – GALA Open Standards Initiative
    • OSCAR standards: http://www.gala-global.org/standards/
    • Coordination: ETSI, OASIS, Unicode Consortium, ISO TC 37
    • Linport project (open format translation packages): http://www.linport.org/
    • QT Launchpad – flexible quality metrics for human and machine translation
    • Tools Corner
  – OASIS (XLIFF, DITA…): http://www.oasis-open.org/standards
  – W3C (ITS, MultilingualWeb-LT – ITS 2.0): http://www.w3.org/
![Page 7: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/7.jpg)
Core Standards
• Four Standards (three areas) that are open, stable, and work well:
  – Translation memories
    • TMX: Translation Memory eXchange¹
      Easy exchange of translation memory among tools
  – Segmentation
    • SRX: Segmentation Rules eXchange¹
      Provides a standard method to describe segmentation rules for TMs that are being exchanged among tools
  – Extracted data
    • ITS: Internationalization Tag Set²
      Used for XML to support the internationalization and localization of XML schemas and documents (XML, HTML5)
    • XLIFF: XML Localisation Interchange File Format³
      Stores localizable content and carries it from one step of the localization process to the next, while allowing interoperability among tools
¹ See GALA Open Standards: http://www.gala-global.org/lisa-oscar-standards
² See W3C: http://www.w3.org/TR/its/
³ See OASIS: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xliff
![Page 8: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/8.jpg)
Using Standards
• Look at an example project
• See the standards involved
• Use standards to provide localized files
![Page 9: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/9.jpg)
Using Standards – Open Source
• Open Standards fit with Open Source
• We work with the Okapi Framework Project¹
• You can use the Okapi Framework to:
  1. Manipulate and combine translation memories
  2. Extract text with appropriate filters
  2. Edit segmentation rules and apply them to content
  2. Leverage from TM
  2. Machine translate unmatched text
  2. Create the translation package for the linguists
  3. Rebuild translated files
¹ See Okapi Framework project site at: http://code.google.com/p/okapi
![Page 10: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/10.jpg)
Example Project
[Diagram: customer-supplied inputs]
• Translation Memory from Trados and Translation Memory from Wordfast (Trados TM, Wordfast TM, exchanged as TMX 1 and TMX 2)
• Segmentation rules for the TMs (SRX Rules)
• New version of the documents to translate, from HTML5 and FrameMaker applications (HTML File, MIF File)
![Page 11: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/11.jpg)
FrameMaker MIF File
![Page 12: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/12.jpg)
HTML5 File
![Page 13: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/13.jpg)
Three Tasks
1. Consolidate TMs
2. Prepare the translation package for linguists
3. Post-process the files for delivery
![Page 14: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/14.jpg)
1) Translation Memories – TMX
• TMX (Translation Memory eXchange) is the standard way to store source text segments and their corresponding translations
• Supported by most CAT tools
• Customer provided two TMs:
  – Trados
  – Wordfast
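To make the format concrete, here is a minimal sketch of what a TMX file looks like and how it can be read with Python's standard library. The sample content, the `creationtool` values, and the `read_tmx` helper are assumptions made for illustration; real exports from Trados or Wordfast carry more metadata and inline codes.

```python
import xml.etree.ElementTree as ET

# xml:lang lives in the XML namespace, so ElementTree exposes it under this key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: one translation unit (tu) with two language variants (tuv).
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Open the file.</seg></tuv>
      <tuv xml:lang="fr"><seg>Ouvrez le fichier.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx(text):
    """Return one {language: segment text} dict per translation unit."""
    root = ET.fromstring(text)
    units = []
    for tu in root.iter("tu"):
        units.append({
            tuv.get(XML_LANG): "".join(tuv.find("seg").itertext())
            for tuv in tu.findall("tuv")
        })
    return units


units = read_tmx(TMX_SAMPLE)
print(units)
```

Because every tool writes the same `tu`/`tuv`/`seg` structure, the reader does not need to know which CAT tool produced the file.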
![Page 15: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/15.jpg)
1) Combine TMs
[Diagram: Wordfast TM, Trados TM, TMX 1, TMX 2, Rainbow Toolbox, Pensieve TM]
Four different tools sharing data through TMX
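Combining two TMX exports can be sketched in a few lines. This is not how Rainbow or Pensieve implement it; it is a simplified illustration that appends the units of one file to another and skips exact duplicates. The `sample_tmx` and `merge_tmx` helpers and the sample data are hypothetical, and the header is abbreviated.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def sample_tmx(pairs):
    """Build a tiny en->fr TMX string from (source, target) pairs (sample data only)."""
    tus = "".join(
        f'<tu><tuv xml:lang="en"><seg>{s}</seg></tuv>'
        f'<tuv xml:lang="fr"><seg>{t}</seg></tuv></tu>'
        for s, t in pairs
    )
    # Abbreviated header: a valid TMX header has more required attributes.
    return f'<tmx version="1.4"><header srclang="en"/><body>{tus}</body></tmx>'


def unit_key(tu):
    """Identify a unit by its (language, text) pairs so duplicates can be detected."""
    return tuple(sorted(
        (tuv.get(XML_LANG), "".join(tuv.find("seg").itertext()))
        for tuv in tu.findall("tuv")
    ))


def merge_tmx(first, second):
    """Append the units of `second` to `first`, skipping exact duplicates."""
    merged = ET.fromstring(first)
    body = merged.find("body")
    seen = {unit_key(tu) for tu in body.findall("tu")}
    for tu in ET.fromstring(second).iter("tu"):
        key = unit_key(tu)
        if key not in seen:
            seen.add(key)
            body.append(tu)
    return merged


tmx_1 = sample_tmx([("Save", "Enregistrer"), ("Close", "Fermer")])
tmx_2 = sample_tmx([("Close", "Fermer"), ("Print", "Imprimer")])
merged = merge_tmx(tmx_1, tmx_2)
sources = [tu.find("tuv/seg").text for tu in merged.iter("tu")]
print(sources)
```

A production merge would also have to reconcile header attributes and decide what to do when the same source has conflicting translations; the sketch keeps both.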
![Page 16: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/16.jpg)
![Page 17: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/17.jpg)
2) HTML5 Extraction – ITS
• For XML and HTML5 documents, ITS (Internationalization Tag Set) describes what needs to be extracted and how to extract it
• W3C MultilingualWeb-LT WG is finishing the work on ITS 2.0
• Let's use ITS rules to identify localizable text in the HTML5 document
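As an illustration of the idea, the sketch below applies two ITS `translateRule` elements to a small document and collects the text that remains translatable. It rests on several assumptions: a well-formed XML snippet stands in for the HTML5 file, the rules and the `translatable_text` helper are made up for this example, only the small XPath subset supported by Python's ElementTree is handled, and text is collected per text node (a real filter would keep inline elements such as `code` inside the segment as inline codes).

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# Hypothetical rules: do not translate <code> content or "legal" paragraphs.
RULES = """<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0">
  <its:translateRule selector="//code" translate="no"/>
  <its:translateRule selector="//p[@class='legal']" translate="no"/>
</its:rules>"""

# Hypothetical document standing in for the HTML5 file.
DOC = """<html><body>
  <h1>Install</h1>
  <p>Run <code>setup.exe</code> to start.</p>
  <p class="legal">ACME-1234</p>
</body></html>"""


def translatable_text(doc_xml, rules_xml):
    """Return the text nodes left translatable after applying the rules."""
    doc = ET.fromstring(doc_xml)
    rules = ET.fromstring(rules_xml)

    # Later rules override earlier ones for the same element.
    flags = {}
    for rule in rules.iter(f"{{{ITS_NS}}}translateRule"):
        # ElementTree needs a relative path, so "//x" becomes ".//x".
        for element in doc.findall("." + rule.get("selector")):
            flags[element] = rule.get("translate") == "yes"

    found = []

    def walk(element, inherited):
        # Elements are translatable by default and inherit from their parent.
        translate = flags.get(element, inherited)
        if translate and element.text and element.text.strip():
            found.append(element.text.strip())
        for child in element:
            walk(child, translate)
            # Tail text follows the child but belongs to the parent element.
            if translate and child.tail and child.tail.strip():
                found.append(child.tail.strip())

    walk(doc, True)
    return found


print(translatable_text(DOC, RULES))
```

The command name and the legal code are excluded, while the surrounding sentence text is kept for translation.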
![Page 18: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/18.jpg)
2) ITS Rules
![Page 19: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/19.jpg)
2) HTML5 Extraction – ITS
[Diagram: Content Extraction pipeline driven by Rainbow. Inputs: HTML File, MIF File, ITS Rules. Filters: HTML5 Filter, MIF Filter.]
ITS rules specify what needs to be translated
![Page 20: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/20.jpg)
2) Segmentation – SRX
• Translation is done at the segment level
  – SRX (Segmentation Rules eXchange) describes where to break or not break the content into segments
  – Having the rules for source segments allows better re-usability of existing TM, increasing exact matches
  – Maintain SRX rules with an SRX Editor
![Page 21: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/21.jpg)
2) Segmentation – SRX
Example rule: do not break a segment after "VS.", "V.S.", "vs." or "v.s."
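The exception above can be sketched as an SRX rule set plus a very small segmenter. This is an illustration under stated assumptions, not the Okapi implementation: the rule file and the `load_rules` and `segment` helpers are written for this example, SRX patterns are specified as ICU regular expressions while the sketch uses Python's `re` module (the two agree for these simple patterns), and the first rule that matches at a position decides whether to break there.

```python
import re
import xml.etree.ElementTree as ET

SRX_NS = {"srx": "http://www.lisa.org/srx20"}

# Hypothetical rule set: the no-break exception comes before the general break rule.
SRX_RULES = r"""<srx xmlns="http://www.lisa.org/srx20" version="2.0">
  <header segmentsubflows="yes" cascade="no"/>
  <body>
    <languagerules>
      <languagerule languagerulename="default">
        <rule break="no">
          <beforebreak>\b(?:VS|V\.S|vs|v\.s)\.</beforebreak>
          <afterbreak>\s</afterbreak>
        </rule>
        <rule break="yes">
          <beforebreak>[.?!]</beforebreak>
          <afterbreak>\s</afterbreak>
        </rule>
      </languagerule>
    </languagerules>
    <maprules>
      <languagemap languagepattern=".*" languagerulename="default"/>
    </maprules>
  </body>
</srx>"""


def load_rules(srx_xml):
    """Return (is_break, before_regex, after_regex) tuples in document order."""
    rules = []
    for rule in ET.fromstring(srx_xml).iterfind(".//srx:rule", SRX_NS):
        before = rule.findtext("srx:beforebreak", default="", namespaces=SRX_NS)
        after = rule.findtext("srx:afterbreak", default="", namespaces=SRX_NS)
        rules.append((
            rule.get("break", "yes") == "yes",
            re.compile(f"(?:{before})\\Z"),  # must match right before the position
            re.compile(after),               # must match right after the position
        ))
    return rules


def segment(text, rules):
    """Split text at every position where the first matching rule says break."""
    pieces, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in rules:
            if before.search(text[:pos]) and after.match(text[pos:]):
                if is_break:
                    pieces.append(text[start:pos].strip())
                    start = pos
                break  # the first matching rule wins
    pieces.append(text[start:].strip())
    # Whitespace is stripped for display only.
    return [piece for piece in pieces if piece]


segments = segment("Compare TMX vs. XLIFF first. Then pick one.", load_rules(SRX_RULES))
print(segments)
```

Without the no-break rule, the text would also split after "vs.", and the resulting segments would no longer match those stored in the TM.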
![Page 22: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/22.jpg)
2) Segmentation – SRX
[Diagram: pipeline driven by Rainbow. Inputs: HTML File, MIF File, ITS Rules, SRX Rules. Steps: Extraction (HTML5 Filter, MIF Filter), Segmentation.]
SRX Rules are key to sharing TMs
![Page 23: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/23.jpg)
2) Translation Kit – XLIFF, TMX
• To flow through the translation process, the extracted content needs to be stored in a common format many tools understand
  – XLIFF (XML Localisation Interchange File Format) is a standard way to represent extracted content
  – TMX files with all the translation candidates found in the TM or from MT
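A minimal sketch of what the XLIFF part of such a kit contains: one `trans-unit` per extracted segment, each with a `source` element that the translation tool later pairs with a `target`. The `build_xliff` helper, the file name, and the sample segments are assumptions for illustration; a real kit produced by a filter also carries inline codes and skeleton information.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
# Serialize XLIFF elements with a default namespace instead of an ns0: prefix.
ET.register_namespace("", XLIFF_NS)


def build_xliff(segments, original, src_lang, trg_lang):
    """Wrap extracted segments in a minimal XLIFF 1.2 document."""
    xliff = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(xliff, f"{{{XLIFF_NS}}}file", {
        "original": original,
        "source-language": src_lang,
        "target-language": trg_lang,
        "datatype": "html",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for number, text in enumerate(segments, start=1):
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": str(number)})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = text
    return xliff


# Hypothetical file name and segments.
kit = build_xliff(["Install", "Run the setup program."], "guide.html", "en", "fr")
print(ET.tostring(kit, encoding="unicode"))
```

Any XLIFF-aware tool can open the result, which is what makes the kit tool independent.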
![Page 24: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/24.jpg)
2) Translation Kit – XLIFF, TMX
Open Source OmegaT TM workbench
![Page 25: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/25.jpg)
2) Translation Kit – XLIFF, TMX
[Diagram: Translation Kit Creation pipeline driven by Rainbow. Inputs: HTML File, MIF File, ITS Rules, SRX Rules. Steps: Extraction (HTML5 Filter, MIF Filter), Segmentation, Pre-translate from TM (Pensieve TM Connector, Pensieve TM), Pre-translate unmatched from MT (Microsoft MT Connector, Microsoft MT), Translation Kit Creation. Output: Translation Kit (HTML XLIFF, MIF XLIFF, TMX, etc.).]
Tool-independent kit
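The "pre-translate from TM" step can be sketched as a lookup that scores each new segment against the TM and leaves low-scoring segments for machine translation. This is only an illustration: the `leverage` helper, the 75% threshold, and the use of `difflib` similarity are assumptions, and the matching used by the Pensieve TM connector is not described in the slides.

```python
from difflib import SequenceMatcher


def leverage(segments, tm, threshold=0.75):
    """Return (source, best TM target or None, match percent) per segment."""
    results = []
    for source in segments:
        best_score, best_target = 0.0, None
        for tm_source, tm_target in tm.items():
            # Character-level similarity between the new and the stored source.
            score = SequenceMatcher(None, source, tm_source).ratio()
            if score > best_score:
                best_score, best_target = score, tm_target
        if best_score >= threshold:
            results.append((source, best_target, round(best_score * 100)))
        else:
            # Unmatched: a later step would send this segment to MT.
            results.append((source, None, 0))
    return results


# Hypothetical TM content and new segments.
tm = {"Open the file.": "Ouvrez le fichier."}
results = leverage(["Open the file.", "Completely new sentence here"], tm)
print(results)
```

Segments that come back with `None` are the "unmatched" ones that the pipeline passes on to the MT connector.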
![Page 26: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/26.jpg)
![Page 27: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/27.jpg)
3) Post-Processing
[Diagram: Translation Kit Post-Processing pipeline driven by Rainbow. Input: Translation Kit (HTML XLIFF, MIF XLIFF, TMX, etc.). Steps: Extraction (Translator Kit Filter), Translation Kit Post-Processing (HTML5 Filter, MIF Filter). Outputs: HTML File, MIF File.]
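Conceptually, post-processing reads the translated XLIFF and puts each target back where the source text was extracted. The sketch below shows the idea with a made-up skeleton format (`[[id]]` placeholders) and a hypothetical `rebuild` helper; the real filters keep their own skeleton information and also handle escaping and inline codes, which this example ignores.

```python
import re
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical translated kit: unit 2 has no target yet.
TRANSLATED = """<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
  <file original="guide.html" source-language="en" target-language="fr" datatype="html">
    <body>
      <trans-unit id="1"><source>Install</source><target>Installer</target></trans-unit>
      <trans-unit id="2"><source>Done</source></trans-unit>
    </body>
  </file>
</xliff>"""

# Hypothetical skeleton: original markup with [[id]] where text was extracted.
SKELETON = "<h1>[[1]]</h1><p>[[2]]</p>"


def rebuild(xliff_xml, skeleton):
    """Fill the skeleton with targets, falling back to source when untranslated."""
    texts = {}
    for unit in ET.fromstring(xliff_xml).iterfind(".//x:trans-unit", NS):
        target = unit.findtext("x:target", namespaces=NS)
        texts[unit.get("id")] = target or unit.findtext("x:source", namespaces=NS)
    return re.sub(r"\[\[(\w+)\]\]", lambda match: texts[match.group(1)], skeleton)


print(rebuild(TRANSLATED, SKELETON))
```

Falling back to the source text keeps the rebuilt file complete even when a unit was left untranslated.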
![Page 28: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/28.jpg)
3) Translated FrameMaker MIF
![Page 29: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/29.jpg)
3) Translated HTML
![Page 30: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/30.jpg)
![Page 31: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/31.jpg)
Summary
• We Know
  – More about our standards
  – We can (and do) use them today
• Next Steps
  – Consider requiring Open Standards compliance with the tools you use to ensure portability
• Get Involved in the Standards Community
  – GALA Standards Initiative
  – GALA Connect Groups
![Page 32: Understand Localization Standards and Use Them Effectively · Understand Localization Standards and Use Them Effectively John Watkins, President, ENLASO jwatkins@enlaso.com](https://reader034.vdocument.in/reader034/viewer/2022042216/5ebf606c9c4b115e0606ee1f/html5/thumbnails/32.jpg)
References
• GALA Standards Initiative http://www.gala-global.org/gala-standards-initiative
• TMX 1.4b – Translation Memory eXchange http://www.gala-global.org/oscarStandards/tmx/
• ITS 1.0 – Internationalization Tag Set http://www.w3.org/TR/its/
• SRX 2.0 – Segmentation Rules eXchange http://www.gala-global.org/oscarStandards/srx/
• XLIFF 1.2 – XML Localisation Interchange File Format http://docs.oasis-open.org/xliff/v1.2/os/xliff-core.html
• Okapi Framework (Open Source & cross-platform) http://code.google.com/p/okapi/