
Page 1: Unicode and Collation Support in Microsoft SQL Server Michael S. Kaplan Globalization Infrastructure and Font Technology Windows International Microsoft

Unicode and Collation Support in Microsoft SQL Server

Michael S. Kaplan
Globalization Infrastructure and Font Technology

Windows International

Microsoft


24-26 March 2003 Prague, Czech Republic (IUC23)

Unicode Support

Uses the "N" or national data types from the SQL-92 specification
– NCHAR, NVARCHAR, NTEXT
What the SQL-99 spec says about Unicode
Interoperability with other clients
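
A minimal sketch of the N types in use, assuming a scratch temporary table called #Greeting (a hypothetical name, not from the talk). The N prefix marks a literal as a Unicode string, so the text is not pushed through a non-Unicode code page first:

```sql
-- Hypothetical scratch table; nvarchar holds Unicode text.
CREATE TABLE #Greeting (id int NOT NULL, txt nvarchar(50) NOT NULL);

-- The N prefix makes each literal a national (Unicode) character string.
INSERT INTO #Greeting (id, txt) VALUES (1, N'čeština');
INSERT INTO #Greeting (id, txt) VALUES (2, N'한국어');

-- LEN counts characters, DATALENGTH counts bytes of storage.
SELECT id, txt, LEN(txt) AS chars, DATALENGTH(txt) AS bytes
FROM #Greeting;

DROP TABLE #Greeting;
```

Leaving the N off a literal gives a non-Unicode string, which can lose characters that the relevant code page does not contain.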


Collation in SQL Server <= 6.5

No Unicode support at all
One code page per server
One collation per server
No good solution for multilingual support


Collation in SQL Server 7.0

Unicode datatypes supported
Two collations
– Unicode
– Non-Unicode
Number of collations distilled down to the minimum necessary


7.0 flattening of collations

Example: the General Unicode sort order handles: Afrikaans, Albanian, Arabic, Basque, Belarusian, Bulgarian, English, Faeroese, Farsi, Georgian (Traditional), Greek, Hebrew, Hindi, Indonesian, Malay, Russian, Serbian, Swahili, and Urdu


OS independence

Collation independent of operating system
Based on the Jet “Unicorn” DLLs


SQL Language Support (limited locale information)

Messages
Date/Time
First Day of Week
Currency and currency symbols
Month/day names and abbreviated month names


SQL Language Support (list of languages)

Arabic, British English, Brazilian, Bulgarian, Simplified Chinese, Traditional Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Italian, Japanese, Korean, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Thai, Turkish


Getting at the list of languages

sp_helplanguage stored procedure
syslanguages/sysmessages tables
SET LANGUAGE
– SET LANGUAGE čeština
– SET LANGUAGE 한국어
Each language has a langid (0 – 32)
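
The three entry points above can be sketched as follows. The choice of Italian and the column list are illustrative assumptions; any installed language name or alias should work the same way:

```sql
-- List every installed language with its langid, name, and alias.
EXEC sp_helplanguage;

-- The same information read directly from the system table.
SELECT langid, name, alias, datefirst
FROM master.dbo.syslanguages
ORDER BY langid;

-- Switch the session language, then ask for a localized month name.
SET LANGUAGE Italian;
SELECT DATENAME(month, GETDATE()) AS month_name;

-- Switch the session back to U.S. English.
SET LANGUAGE us_english;
```

SET LANGUAGE affects only the current session: it changes messages, date formats, and day and month names, not how strings are compared or sorted.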


Collation in SQL Server 2000

Combined code pages and collations into a single entity

"Windows" collations

Added for unique code pages (Example – Arabic)
Added for unique ordering (Example – French)
Removed for identical ordering (Example – Finnish_Swedish)


43 Windows Collations

Albanian, Arabic, Chinese_PRC, Chinese_PRC_Stroke, Chinese_Taiwan_Bopomofo, Chinese_Taiwan_Stroke, Cyrillic_General, Croatian, Czech, Danish_Norwegian, Estonian, Finnish_Swedish, French, Georgian_Modern_sort, German_PhoneBook, Greek, Hebrew, Hindi, Hungarian, Hungarian_Technical, Icelandic, Japanese, Japanese_Unicode, Korean_Wansung, Korean_Wansung_Unicode, Latin1_General, Latvian, Lithuanian, Lithuanian_Classic, FYRO Macedonian, Spanish (Spain), Polish, Romanian, Slovak, Slovenian, Thai, Traditional_Spanish, Turkish, Ukrainian, Vietnamese


Windows collations, continued

Suffix meanings
– _BIN (Binary)
– _CI/_CS (Case sensitivity)
– _AI/_AS (Accent sensitivity)
– _KS - kanatype sensitivity (hiragana/katakana)
– _WS - width sensitivity (full/half width)
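
A rough illustration of the case and accent suffixes, using Latin1_General from the list of Windows collations. The sample strings and column aliases are assumptions chosen for the sketch:

```sql
SELECT
    -- _CI: case differences are ignored.
    CASE WHEN N'abc' = N'ABC' COLLATE Latin1_General_CI_AS
         THEN 'equal' ELSE 'different' END AS case_insensitive,
    -- _CS: case differences count.
    CASE WHEN N'abc' = N'ABC' COLLATE Latin1_General_CS_AS
         THEN 'equal' ELSE 'different' END AS case_sensitive,
    -- _AI: accent differences are ignored.
    CASE WHEN N'resume' = N'résumé' COLLATE Latin1_General_CI_AI
         THEN 'equal' ELSE 'different' END AS accent_insensitive,
    -- _AS: accent differences count.
    CASE WHEN N'resume' = N'résumé' COLLATE Latin1_General_CI_AS
         THEN 'equal' ELSE 'different' END AS accent_sensitive;
```

The same pattern applies to _KS and _WS, with hiragana/katakana or full-width/half-width pairs as the test strings.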


SQL Collations

Provided for backwards compatibility with prior versions of SQL Server


SQL Collations

SQL_1xCompat_CP850, SQL_AltDiction_CP1253, SQL_AltDiction_CP850, SQL_AltDiction_Pref_CP850, SQL_Croatian_CP1250, SQL_Czech_CP1250, SQL_Danish_Pref_CP1, SQL_EBCDIC037_CP1, SQL_EBCDIC273_CP1, SQL_EBCDIC277_CP1, SQL_EBCDIC278_CP1, SQL_EBCDIC280_CP1, SQL_EBCDIC284_CP1, SQL_EBCDIC285_CP1

SQL_Estonian_CP1257, SQL_Hungarian_CP1250, SQL_Icelandic_Pref_CP1, SQL_Latin1_General_CP1, SQL_Latin1_General_CP1250, SQL_Latin1_General_CP1251, SQL_Latin1_General_CP1253, SQL_Latin1_General_CP1254, SQL_Latin1_General_CP1255, SQL_Latin1_General_CP1256, SQL_Latin1_General_CP1257, SQL_Latin1_General_CP437, SQL_Latin1_General_CP850, SQL_Latin1_General_Pref_CP1

SQL_Latin1_General_Pref_CP437, SQL_Latin1_General_Pref_CP850, SQL_Latvian_CP1257, SQL_Lithuanian_CP1257, SQL_MixDiction_CP1253, SQL_Polish_CP1250, SQL_Romanian_CP1250, SQL_Scandinavian_CP850, SQL_Scandinavian_Pref_CP850, SQL_Slovak_CP1250, SQL_Slovenian_CP1250, SQL_SwedishPhone_Pref_CP1, SQL_SwedishStd_Pref_CP1, SQL_Ukrainian_CP1251


Collation at four levels

Server
Database
Column
Expression


At the server level

Acts as a default for all databases
Can be changed with RebuildM.exe in the tools\BINN dir
Querying the server collation:

SELECT CONVERT(nvarchar(128), SERVERPROPERTY('collation'))


At the database level

Every database has a collation (default is the server collation)

Collation can be changed under some circumstances
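
A sketch of setting, inspecting, and changing a database collation, assuming a throwaway database named CollationDemo (hypothetical) and sufficient permissions. The collation names come from the Windows collations listed earlier:

```sql
-- Create a database whose default collation differs from the server's.
CREATE DATABASE CollationDemo COLLATE Czech_CI_AS;

-- Read the database collation back.
SELECT DATABASEPROPERTYEX('CollationDemo', 'Collation') AS db_collation;

-- Change the database default; this needs exclusive access to the database
-- and can be refused when existing objects depend on the current collation.
ALTER DATABASE CollationDemo COLLATE Latin1_General_CI_AS;

DROP DATABASE CollationDemo;
```

Changing the database default affects columns created afterwards; existing columns keep the collation they were created with until they are altered individually.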


At the column level

Overrides database level collation
Specifies code page for non-Unicode columns
Again, can be changed under some circumstances
No multilingual columns with separate collations
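
A minimal sketch of per-column collations, assuming a hypothetical table dbo.Customers. For the varchar column the collation also selects the code page used to store the data:

```sql
CREATE TABLE dbo.Customers (
    id          int           NOT NULL,
    -- Unicode column: the collation controls comparison and sorting only.
    name_cs     nvarchar(100) COLLATE Czech_CI_AS NOT NULL,
    -- Non-Unicode column: the collation also fixes the code page (Greek here).
    name_legacy varchar(100)  COLLATE Greek_CI_AS NULL
);

-- Changing a column's collation afterwards; this is refused if, for example,
-- an index or constraint depends on the column.
ALTER TABLE dbo.Customers
    ALTER COLUMN name_cs nvarchar(100) COLLATE Latin1_General_CI_AS NOT NULL;

DROP TABLE dbo.Customers;
```

Each column carries exactly one collation, which is why rows in several languages stored in one column cannot each sort by their own rules at the column level.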


At the expression level

Can be used to override any other collation
Uses the COLLATE keyword
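
As an illustration, the same column can be sorted under two sets of rules in two queries. The table #Names and its rows are assumptions made up for this sketch:

```sql
CREATE TABLE #Names (name nvarchar(50) COLLATE Latin1_General_CI_AS NOT NULL);

INSERT INTO #Names (name) VALUES (N'chata');
INSERT INTO #Names (name) VALUES (N'cibule');
INSERT INTO #Names (name) VALUES (N'hora');

-- Sorted with the column's own collation.
SELECT name FROM #Names ORDER BY name;

-- Sorted with Czech rules for this query only; Czech treats "ch" as a
-- separate letter that follows "h", so 'chata' should move to the end.
SELECT name FROM #Names ORDER BY name COLLATE Czech_CI_AS;

DROP TABLE #Names;
```

An expression-level COLLATE changes nothing in storage, so it suits per-query or per-user ordering over shared data.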


Metadata in System Tables

All stored as Unicode no matter what the database collation is

Unicode 2.0 repertoire is used for identifiers (use brackets or quotes around anything else)


More on the COLLATE keyword

COLLATE [<Windows_Collation_name>|<SQL_Collation_Name>]

Specific rules of precedence:
– Explicit (two explicits == runtime error)
– Implicit (two implicits == no collation)
– Default
– <no collation>
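
A hedged sketch of the precedence rules, assuming a hypothetical temporary table #Pair whose two columns carry different collations. The failing statements are left commented out so the batch runs:

```sql
CREATE TABLE #Pair (
    a nvarchar(10) COLLATE Czech_CI_AS  NOT NULL,
    b nvarchar(10) COLLATE French_CI_AS NOT NULL
);

-- Explicit beats implicit: the comparison is done with Czech_CI_AS.
SELECT a, b FROM #Pair WHERE a = b COLLATE Czech_CI_AS;

-- Implicit vs. implicit: neither column collation wins, so the comparison
-- has no collation to use and is rejected.
-- SELECT a, b FROM #Pair WHERE a = b;

-- Explicit vs. explicit with different collations: also an error.
-- SELECT a, b FROM #Pair WHERE a COLLATE Czech_CI_AS = b COLLATE French_CI_AS;

DROP TABLE #Pair;
```

In practice, a single COLLATE clause on one side of the comparison is the usual way to resolve a conflict between two columns.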


Limitations

Features people will want for future versions
– LCID --> Collation
– ISO string <--> Collation
– Creating custom collations?


References

http://microsoft.com/globaldev/
“International Features in Microsoft SQL Server 2000” (by Michael Kaplan) at http://msdn.microsoft.com/


Questions?


Unicode and Collation Support in Microsoft SQL Server

Don’t Forget Your Evaluations!