Trend Micro G11N Team
Software Globalization
Alan Chang, Software Engineer
May 27, 2003
Trend Micro CONFIDENTIAL- ENGINEERING DOCUMENT
What is Software Globalization?
• G11N
  – 11 is the number of letters between the first and last letters of the word “globalization”
• Includes:
  – Software Internationalization
  – Software Localization
Why Software Globalization?
• Earn more money from non-English markets with little extra development effort
• Business Week: worldwide software revenues will be worth $270 billion by 2003
• Global Reach: Global e-commerce will reach $6.8 trillion by 2004
• More than 40% of Trend Micro’s revenue is from non-English markets
Software Internationalization
• I18N, the first step of G11N
  – 18 is the number of letters between the first and last letters of the word "internationalization"
• One Binary, Runs Globally
  – One set of source code and one main binary supports all global markets
  – Can be localized later without any code change, merely by translating text, resizing UIs and supplying new graphics
Typical Software I18N Issues
• Writing System
  – Charset, Codeset, Encoding, Codepage
  – String manipulation
  – ……
• Culture-Sensitive Data Formatting
  – Numeric Formatting
  – Date/Time Formatting
  – Message Formatting
  – ……
I18N Issue Example: Codepage
20 30 40 50 60 70 80 90 A0 B0 C0 D0 E0 F0
0 SP 0 @ P ` p € � NBSP ° À Ð à ð
1 ! 1 A Q a q � ‘ ¡ ± Á Ñ á ñ
2 " 2 B R b r ‚ ’ ¢ ² Â Ò â ò
3 # 3 C S c s ƒ “ £ ³ Ã Ó ã ó
4 $ 4 D T d t „ ” ¤ ´ Ä Ô ä ô
5 % 5 E U e u … • ¥ µ Å Õ å õ
6 & 6 F V f v † – ¦ ¶ Æ Ö æ ö
7 ' 7 G W g w ‡ — § · Ç × ç ÷
8 ( 8 H X h x ˆ ˜ ¨ ¸ È Ø è ø
9 ) 9 I Y i y ‰ ™ © ¹ É Ù é ù
A * : J Z j z Š š ª º Ê Ú ê ú
B + ; K [ k { ‹ › « » Ë Û ë û
C , < L \ l | Œ œ ¬ ¼ Ì Ü ì ü
D - = M ] m } � � SHY ½ Í Ý í ý
E . > N ^ n ~ Ž ž ® ¾ Î Þ î þ
F / ? O _ o DEL � Ÿ ¯ ¿ Ï ß ï ÿ
40 50 60 70 80 90 A0 B0 C0 D0 E0 F0
0 SP & - ø Ø º µ ^ { } \ 0
1 NBSP é / É a j ~ £ A J ÷ 1
2 â ê Â Ê b k s ¥ B K S 2
3 ä ë Ä Ë c l t · C L T 3
4 à è À È d m u © D M U 4
5 á í Á Í e n v § E N V 5
6 ã î Ã Î f o w ¶ F O W 6
7 å ï Å Ï g p x ¼ G P X 7
8 ç ì Ç Ì h q y ½ H Q Y 8
9 ñ ß Ñ ` i r z ¾ I R Z 9
A ¢ ! ¦ : « ª ¡ [ SHY ¹ ² ³
B . $ , # » ° ¿ ] ô û Ô Û
C < * % @ ð æ Ð ¯ ö ü Ö Ü
D ( ) _ ' ý ¸ Ý ¨ ò ù Ò Ù
E + ; > = þ Æ Þ ´ ó ú Ó Ú
F | ¬ ? " ± ¤ ® × õ ÿ Õ
I18N Issue Example: Codepage Conversion
String in EUC-JP encoding: [image not preserved]
Same EUC-JP string displayed under a Japanese Windows environment: [image not preserved]
Same EUC-JP string displayed correctly after being converted into Shift-JIS encoding: [image not preserved]
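The conversion step on this slide can be sketched in C. This is a minimal illustration, not the converter used in any Trend Micro product: the function name `eucjp_to_sjis` is hypothetical, and the sketch covers only ASCII, half-width katakana and two-byte JIS X 0208 characters. A shipping product would more likely call a platform or library converter.

```c
#include <stddef.h>

/*
 * Convert a NUL-terminated EUC-JP byte string to Shift-JIS.
 * Supported input: ASCII, half-width katakana (0x8E prefix) and
 * two-byte JIS X 0208 characters.
 * Returns the number of bytes written (excluding the NUL), or -1 on
 * malformed input, on the unsupported three-byte form (0x8F prefix),
 * or when dst is too small.
 */
long eucjp_to_sjis(const unsigned char *src, unsigned char *dst, size_t dst_size)
{
    size_t out = 0;

    while (*src != 0) {
        unsigned char c1 = *src++;

        if (c1 < 0x80) {                        /* ASCII: copy unchanged */
            if (out + 1 >= dst_size) return -1;
            dst[out++] = c1;
        } else if (c1 == 0x8E) {                /* half-width katakana */
            unsigned char c2 = *src++;
            if (c2 < 0xA1 || c2 > 0xDF) return -1;
            if (out + 1 >= dst_size) return -1;
            dst[out++] = c2;                    /* same byte value in Shift-JIS */
        } else if (c1 >= 0xA1 && c1 <= 0xFE) {  /* JIS X 0208: two bytes */
            unsigned char c2 = *src++;
            unsigned char j1, j2;
            if (c2 < 0xA1 || c2 > 0xFE) return -1;
            if (out + 2 >= dst_size) return -1;
            j1 = (unsigned char)(c1 - 0x80);    /* EUC-JP byte = JIS byte + 0x80 */
            j2 = (unsigned char)(c2 - 0x80);
            /* Two JIS rows share one Shift-JIS lead byte. */
            dst[out++] = (unsigned char)(((j1 + 1) >> 1) + (j1 < 0x5F ? 0x70 : 0xB0));
            /* Odd rows use the low half of the trail range (skipping 0x7F),
               even rows use the high half. */
            dst[out++] = (unsigned char)(j2 + ((j1 & 1) ? (j2 < 0x60 ? 0x1F : 0x20) : 0x7E));
        } else {
            return -1;                          /* 0x8F prefix or invalid byte */
        }
    }
    if (out >= dst_size) return -1;
    dst[out] = 0;
    return (long)out;
}
```

For example, the EUC-JP bytes C6 FC CB DC (日本) become the Shift-JIS bytes 93 FA 96 7B. Displaying the unconverted bytes on a Shift-JIS system is what produces the garbled text.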
I18N Issue Example: String Manipulation
• General character issues
  – Variable length, fixed length, multi-byte, single-byte
  – Modal, non-modal
• Iteration
  – (NON-I18N) ++psz, --psz, _strinc, _strdec …
• Character Classification/Searching
  – (NON-I18N) isalpha, isdigit, strchr, strrchr …
• Character Manipulation/Calculation
  – (NON-I18N) tolower, toupper …
  – (NON-I18N) char c = 'R'; c += 1; /* is c == 'S'? True in ASCII, but not in EBCDIC, where 'R' is 0xD9 and 'S' is 0xE2 */
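To make the iteration and searching issues concrete, here is a small sketch of a character search that walks a Shift-JIS string one character at a time instead of one byte at a time. The names `sjis_is_lead` and `sjis_find_char` are made up for this illustration, and the lead-byte ranges are the usual Shift-JIS ones (0x81 to 0x9F and 0xE0 to 0xFC).

```c
#include <stddef.h>

/* Nonzero if b is the first byte of a two-byte Shift-JIS character. */
static int sjis_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/*
 * Find the first occurrence of the single-byte character ch in the
 * Shift-JIS string s. Trail bytes of two-byte characters are skipped,
 * so a trail byte that happens to equal ch is never reported.
 * Returns NULL if ch does not occur as a character of its own.
 */
const char *sjis_find_char(const char *s, char ch)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p != 0) {
        if (sjis_is_lead(*p)) {
            if (p[1] == 0) break;   /* truncated two-byte character */
            p += 2;                 /* step over the whole character */
        } else {
            if (*p == (unsigned char)ch) return (const char *)p;
            p += 1;
        }
    }
    return NULL;
}
```

The classic failure case is the character 表, encoded in Shift-JIS as 0x95 0x5C. A plain `strchr(s, '\\')` matches its second byte and reports a backslash that is not there, while the character-aware search does not.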
I18N Issue Example: Numeric Formatting
• Group and decimal separators
  – 1,234.56 (US)
  – 1.234,56 (DE)
  – 1 234,56 (FR)
• Numeric shapes
  – 2 二 貳 ๒ ٢
• Negative numbers
  – -1,234
  – 1,234-
• Percentage symbols and placement
  – 45%
  – .45
  – 45 pct
  – %45
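The separator differences above can be handled by treating the separators as data rather than as literals in the code. The sketch below is illustrative only: `struct num_format` and `format_hundredths` are hypothetical names, and a real product would normally take these settings from the operating system or from a library such as ICU instead of filling them in by hand.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical per-market settings, supplied as data. */
struct num_format {
    const char *group_sep;    /* "," for US, "." for DE, " " for FR */
    const char *decimal_sep;  /* "." for US, "," for DE and FR */
};

/* Append s to buf at *pos, keeping room for the terminating NUL. */
static int append(char *buf, size_t size, size_t *pos, const char *s)
{
    size_t n = strlen(s);
    if (*pos + n >= size) return -1;
    memcpy(buf + *pos, s, n + 1);
    *pos += n;
    return 0;
}

/*
 * Format a non-negative amount given in hundredths (123456 means
 * 1234.56) with the separators in fmt.
 * Returns 0 on success, -1 if buf is too small.
 */
int format_hundredths(unsigned long value, const struct num_format *fmt,
                      char *buf, size_t size)
{
    char digits[32], frac[8];
    size_t len, i, pos = 0;

    if (size == 0) return -1;
    buf[0] = '\0';
    snprintf(digits, sizeof digits, "%lu", value / 100);
    snprintf(frac, sizeof frac, "%02lu", value % 100);
    len = strlen(digits);

    for (i = 0; i < len; i++) {
        char one[2];
        /* Insert a group separator before each remaining block of three digits. */
        if (i > 0 && (len - i) % 3 == 0 &&
            append(buf, size, &pos, fmt->group_sep) != 0) return -1;
        one[0] = digits[i];
        one[1] = '\0';
        if (append(buf, size, &pos, one) != 0) return -1;
    }
    if (append(buf, size, &pos, fmt->decimal_sep) != 0) return -1;
    return append(buf, size, &pos, frac);
}
```

With `{",", "."}` the value 123456 is rendered as 1,234.56, and with `{".", ","}` as 1.234,56. Digit shapes, negative-number placement and percent placement are not covered by this sketch.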
I18N Issue Example: Monetary Formatting
• Wrong assumption: the currency symbol is the same on all platforms
  – Developers might make the cultural assumption that “$” is used as the currency symbol around the world
• Monetary format examples:
  – $1,234.56 (US)
  – 1.234,56 € (DE)
  – ¥ 1,234.56 (JA)
I18N Issue Example: Date/Time Formatting
• Short, Medium, Long, Full
  – 11/06/2001 (Short)
  – Monday, October 22, 2001 2:23AM (Long)
• Calendar System
  – 平成, AD, BC, 農曆, 民國 (Era)
  – Thursday, 木曜日, Donnerstag, 星期四 (Day of the Week)
  – October, Octobre, 十月
  – AM PM, 早上 下午, 午前
• Time Zone
  – GMT, GMT-8 (Standard Time)
  – Daylight Saving Time (Local Time with offset)
  – Europe/Dublin, PST, PDT (Time Zone ID)
I18N Issue Example: Date/Time Formatting (cont.)
• Date format examples:
  – Thu 01/17/2002 (US)
  – Thu 17/01/2002 (UK)
  – Do 17.01.2002 (DE)
  – 2002/01/17 木 (JA)
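The numeric part of these four date formats can be produced from one code path if the pattern is data. The sketch below uses the standard `strftime` function with a small pattern table; the table, the `date_style` enum and `format_date` are illustrative names. Weekday names are left out on purpose, because they need translated text rather than a different field order.

```c
#include <stddef.h>
#include <time.h>

/* One numeric pattern per market from the examples above. */
enum date_style { DATE_US, DATE_UK, DATE_DE, DATE_JA };

static const char *const DATE_PATTERNS[] = {
    "%m/%d/%Y",   /* US: 01/17/2002 */
    "%d/%m/%Y",   /* UK: 17/01/2002 */
    "%d.%m.%Y",   /* DE: 17.01.2002 */
    "%Y/%m/%d"    /* JA: 2002/01/17 */
};

/*
 * Format a calendar date with the pattern for the given style.
 * Returns the number of bytes written, or 0 if buf is too small
 * (the strftime convention).
 */
size_t format_date(char *buf, size_t size, enum date_style style,
                   int year, int month, int day)
{
    struct tm tm = {0};

    tm.tm_year = year - 1900;   /* struct tm counts years from 1900 */
    tm.tm_mon  = month - 1;     /* and months from 0 */
    tm.tm_mday = day;
    return strftime(buf, size, DATE_PATTERNS[style], &tm);
}
```

Calling `format_date(buf, sizeof buf, DATE_DE, 2002, 1, 17)` fills `buf` with 17.01.2002. In a localized product the pattern would come from the locale data or a resource file, not from a table compiled into the binary.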
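I18N Issue Example: Message Formatting
• English: “There are %d viruses found in file %s !!”
  – sprintf(szMyMsg, szFormat, nFile, szFileName);
• Traditional Chinese: “在 %s 中找到 %d 隻病毒 !!”
  – sprintf(szMyMsg, szFormat, szFileName, nFile);
• The translated message needs its arguments in a different order, so a single sprintf call with a fixed argument order cannot serve both languages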
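One way to keep a single call for every language is to let the format string pick its arguments by position. The sketch below relies on the numbered-argument syntax (`%1$d`, `%2$s`) that POSIX adds to the printf family. This is an assumption about the target platform: it is not part of ISO C, so it will not work with every C runtime. The function name `format_virus_message` is hypothetical.

```c
#include <stdio.h>

/*
 * Format the "virus found" message. fmt comes from a translated
 * resource and refers to its arguments by position:
 *   %1$d  the number of viruses
 *   %2$s  the file name
 * The argument order in the code is therefore the same for every language.
 * Returns the snprintf result (length the full message needs).
 */
int format_virus_message(char *buf, size_t size, const char *fmt,
                         int virus_count, const char *file_name)
{
    return snprintf(buf, size, fmt, virus_count, file_name);
}
```

With this approach the English resource could read "There are %1$d viruses found in file %2$s !!" and the Chinese one "在 %2$s 中找到 %1$d 隻病毒 !!", and the calling code stays identical. On Windows, message resources formatted through `FormatMessage` use numbered inserts (`%1`, `%2`) for the same purpose.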
I18N Issue Example: Address/Phone Number Formatting
• Wrong assumption: the address pattern is the same on all platforms
  – Developers might make the cultural assumption that the pattern “Street Number”, “City”, “State”, “ZIP” covers the whole world
  – The same problem can occur with phone number formats
I18N Issue Example: Buffer Capacity
• Developers might allocate a buffer that is too small to hold the resource string after translation
• The result is a crash or a truncated string
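A common way to avoid guessing the buffer size is to measure the formatted message first and allocate exactly that much. The sketch below uses the C99 behavior of `snprintf`, which returns the required length when given a zero-size buffer. The helper name `format_alloc` is made up for this example, and it handles a single `%s` argument only.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Build a message from a (possibly translated) format that takes one
 * %s argument. The buffer is sized from the actual formatted length,
 * so a longer translation cannot overflow or be cut off.
 * Returns a malloc'ed string the caller must free, or NULL on failure.
 */
char *format_alloc(const char *fmt, const char *arg)
{
    char *buf;
    /* First pass: write nothing, only ask how many bytes are needed. */
    int needed = snprintf(NULL, 0, fmt, arg);

    if (needed < 0) return NULL;
    buf = malloc((size_t)needed + 1);
    if (buf == NULL) return NULL;
    /* Second pass: format into the correctly sized buffer. */
    snprintf(buf, (size_t)needed + 1, fmt, arg);
    return buf;
}
```

The same two-pass idea applies when loading a resource string: query or compute the length first, then allocate, instead of relying on a fixed array sized for the English text.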
I18N Issue Example: UI Truncation & Layout
• If an English UI string becomes longer after translation, it might be truncated or wrapped because the UI layout has limited space
• Also called a localizability problem
I18N Issue Example: Hard-Coded Strings
• Developers might hard-code messages and forget to move them to a resource file
• As a result, the message always appears in English after localization, whatever the platform
• The most common I18N bug
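The fix for a hard-coded string is to refer to messages by ID and look the text up at run time. The sketch below stands in for a resource file with an in-memory table, purely for illustration: the enums, the table and `get_message` are hypothetical, and the German strings are sample translations, not text from any product.

```c
#include <stddef.h>

/* Message IDs used by the code in place of literal text. */
enum msg_id  { MSG_SCAN_DONE, MSG_VIRUS_FOUND, MSG_COUNT };
enum lang_id { LANG_EN, LANG_DE, LANG_COUNT };

/* Stand-in for externalized resources: one row per language. */
static const char *const MESSAGES[LANG_COUNT][MSG_COUNT] = {
    { "Scan complete",       "Virus found"    },   /* LANG_EN */
    { "Suche abgeschlossen", "Virus gefunden" }    /* LANG_DE */
};

/*
 * Return the text for a message ID in the given language,
 * or an empty string if either index is out of range.
 */
const char *get_message(enum lang_id lang, enum msg_id id)
{
    if ((int)lang < 0 || (int)lang >= LANG_COUNT) return "";
    if ((int)id < 0 || (int)id >= MSG_COUNT) return "";
    return MESSAGES[lang][id];
}
```

Code that calls `get_message(lang, MSG_VIRUS_FOUND)` instead of embedding "Virus found" can be localized by adding a row of translations, with no change to the calling code.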
Software Localization
• L10N, the second step of G11N
  – 10 is the number of letters between the first and last letters of the word “localization”
• Customization for the target market
  – Usually involves translating text and supplying new graphics
Typical Software L10N Issues
• Cultural Differences
  – UI Text Translation
  – UI Layout Customization
  – Documentation Translation
  – Icon Localization
  – Legal Issues
  – ……
L10N Issue Example: Website Layout Localization
Good Practices for Software Globalization
• Verify the need for software globalization in your products
• Consider software globalization early in the product development cycle
• Avoid cultural assumptions
• Externalize L10N-related resources
• Adopt industry-standard solutions (Microsoft NLS APIs, IBM ICU, Unicode)
• Think about L10N while coding
A Cold Joke about Globalization
• Which year should be the first year of the millennium?
  – 2000?
  – 2001?
• This is my answer:
  – 2001 is the better answer from a G11N point of view
  – Because no one will be confused by the date 01/01/01
Q&A