Adam Goucher: I18N and L10N
DESCRIPTION
Slides from my recent presentation on I18N and L10N at GLSEC 2007
TRANSCRIPT
www.jonahgroup.com [email protected](416) 304-0860
I18N & L10N: a technical primer
Adam Goucher
Senior Quality Specialist, Jonah Group
http://www.jonahgroup.com
http://adam.goucher.ca
Definitions
Internationalization: I + 18 characters + N = I18N
• Your application can accept, store, manipulate, retrieve and display text in the user’s native language
Localization: L + 10 characters + N = L10N
• Your application looks as if it was designed for the locale it is being used in
The Problem
English is the native language of only about 30% of the Internet’s population.
To avoid alienating the other 70% of your potential customers, you need to worry about I18N and L10N.
Don’t worry
I18N and L10N are technical problems, not linguistic ones.
Programmers and testers know how to solve technical problems.
Translation is the linguistic problem. Translators know how to solve linguistic problems.
Unicode
Unicode [1] provides a unique number:
• for every character
• no matter what the platform
• no matter what the program
• no matter what the language
There are a number of ways (called encodings) to represent a Unicode code point (roughly, a single character) as bytes:
• UTF-8 [2] is a variable-length encoding built from 8-bit units (1 to 4 bytes per code point)
• UTF-8 is the de facto standard
[1] http://www.unicode.org
[2] http://en.wikipedia.org/wiki/UTF-8
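To make the difference between a code point and its encoding concrete, here is a minimal Python sketch. The helper name describe and the sample characters are illustrative choices, not part of the original slides.

```python
# Sample characters written as escapes so the source file itself stays ASCII:
# Latin capital A, e with acute accent, a CJK ideograph, and an emoji.
SAMPLES = ["A", "\u00e9", "\u65e5", "\U0001F600"]


def describe(ch):
    """Return (code point in U+XXXX notation, number of bytes in UTF-8)."""
    code_point = "U+{:04X}".format(ord(ch))   # the unique Unicode number
    byte_length = len(ch.encode("utf-8"))     # how many bytes UTF-8 needs
    return code_point, byte_length


for ch in SAMPLES:
    code_point, byte_length = describe(ch)
    print(code_point, "->", byte_length, "byte(s) in UTF-8")
```

The code point stays the same on every platform; only the byte representation depends on the encoding chosen.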
Resource Bundles
One of the more difficult things to get right is all the string data embedded in your source code.
The easiest solution here is to use resource bundles (locale-specific collections of string data).
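As a rough illustration of the idea, the sketch below models a resource bundle as one dictionary per locale with a fallback to a default locale. The names BUNDLES, DEFAULT_LOCALE and get_string, and the sample keys, are hypothetical; real projects would normally load bundles from per-locale files using their platform's own mechanism.

```python
# Hypothetical in-memory bundles; in practice these live in per-locale files.
BUNDLES = {
    "en": {"greeting": "Hello!", "farewell": "Goodbye!"},
    "fr": {"greeting": "Bonjour !"},
}
DEFAULT_LOCALE = "en"


def get_string(key, locale):
    """Look up a key in the locale's bundle, falling back to the default."""
    bundle = BUNDLES.get(locale, {})
    if key in bundle:
        return bundle[key]
    # Missing locale or missing key: use the default locale's text instead.
    return BUNDLES[DEFAULT_LOCALE][key]


print(get_string("greeting", "fr"))
print(get_string("farewell", "fr"))
```

The code only ever refers to keys such as greeting, so adding a locale means adding a bundle rather than touching the source.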
String Rules
Like most tools, resource bundles can make your life difficult if not used correctly.
• Do not build display strings by concatenating fragments. This makes translation harder by removing context.
• Include all punctuation in the bundle content; otherwise the translated words can be correct while the punctuation is wrong for the locale.
• Include formatting in the bundle content.
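The rules above can be sketched as follows: each bundle entry holds a whole sentence, punctuation included, with named placeholders, so a translator is free to reorder the parts. The key files_deleted, the function render, and the sample translations are assumptions made for illustration.

```python
# Hypothetical bundles: one complete sentence per key, punctuation included.
BUNDLES = {
    "en": {"files_deleted": "{user} deleted {count} files."},
    "de": {"files_deleted": "{count} Dateien wurden von {user} entfernt."},
}


def render(locale, key, **values):
    """Fill the named placeholders of a bundle entry."""
    return BUNDLES[locale][key].format(**values)


# Anti-pattern for comparison: the word order is fixed in code, and the
# translator would only ever see the fragment " deleted " without context.
concatenated = "Ana" + " deleted " + str(3) + " files."

print(render("en", "files_deleted", user="Ana", count=3))
print(render("de", "files_deleted", user="Ana", count=3))
```

Note that the German entry places the count before the user name, which the concatenated version could not express without a code change.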
Resource Bundle Tests
• LOUD [3] to check for string rules
• Resource key in bundle, but not in code
• Resource key in code, but not in bundle
• Key present in (or missing from) some locales but not others
[3] http://adam.goucher.ca/?p=28
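The key-based checks in this list (everything except LOUD, which is described at the link above) reduce to set comparisons. The sketch below assumes the keys used in code have already been collected somehow, for example by scanning the source; all function names and sample data are hypothetical.

```python
def missing_from_bundle(keys_in_code, bundle):
    """Keys the code asks for that the bundle does not define."""
    return sorted(set(keys_in_code) - set(bundle))


def unused_in_code(keys_in_code, bundle):
    """Keys the bundle defines that the code never references."""
    return sorted(set(bundle) - set(keys_in_code))


def locale_key_mismatches(bundles, reference="en"):
    """Compare every locale's key set against a reference locale."""
    reference_keys = set(bundles[reference])
    report = {}
    for locale, bundle in bundles.items():
        if locale == reference:
            continue
        keys = set(bundle)
        missing = sorted(reference_keys - keys)
        extra = sorted(keys - reference_keys)
        if missing or extra:
            report[locale] = {"missing": missing, "extra": extra}
    return report


# Hypothetical sample data for illustration only.
bundles = {
    "en": {"greeting": "Hello!", "farewell": "Goodbye!"},
    "fr": {"greeting": "Bonjour !", "legacy": "Ancien"},
}
keys_in_code = ["greeting", "title"]

print(missing_from_bundle(keys_in_code, bundles["en"]))
print(unused_in_code(keys_in_code, bundles["en"]))
print(locale_key_mismatches(bundles))
```

Checks like these are cheap enough to run on every build and fail it when a report is non-empty.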
Other areas
I18N and L10N together make a huge topic. Some of what has not been discussed:
• Date / Time
• Numbers
• Currency
• Username / Password conventions
• Postal / Zip Codes
• Paper size (when printing)
Testing Advice
• Test your application’s I18N and L10N early to avoid having to re-test everything.
• Include as many checks as possible during the build process
• Beta test translations with friendly customers
Summary
• This is a technical problem, not a linguistic one
• Use UTF-8 everywhere you can
• Use resource bundles instead of putting literal strings in the code
• Learn about the nuances of your target locales
• Test early