  • Proposal for an Oriya Script Root Zone Label Generation Ruleset (LGR)

    LGR Version: 3.0
    Date: 2018-08-08
    Document version: 2.9
    Authors: Neo-Brahmi Generation Panel [NBGP]

    1 General Information/Overview/Abstract

    The purpose of this document is to give an overview of the proposed Root Zone Label Generation Rules for the Oriya script. It includes a discussion of relevant features of the script, the communities or languages using it, the process and methodology used, and information on the contributors. The formal specification of the LGR can be found in the accompanying XML document:

    Proposal-LGR-Orya-20180808.xml

    Labels for testing can be found in the accompanying text document:

    Oriya-Test-Labels-20180808.txt

    2 Script for which the LGR is proposed

    ISO 15924 Code: Orya

    ISO 15924 Key N°: 327

    ISO 15924 English Name: Oriya (Odia)

    Latin transliteration of native script name: oḍiā

    Native name of the script: ଓଡ଼ିଆ

    Maximal Starting Repertoire (MSR) version: MSR-3

    3 Background on Script and Principal Languages using it

    Odia (known in Unicode as Oriya) is an Eastern Indic language spoken by about 40 million people, mainly in the Indian state of Odisha (Orissa) and also in parts of West Bengal, Jharkhand, Chhattisgarh and Andhra Pradesh. Oriya (Odia) is one of the many official languages of India. It is the official language of Odisha and the second official language of Jharkhand. Eminent linguists such as John Beames, G. A. Grierson, L. S. S. O’Malley, Suniti Kumar Chatterjee, S. N. Rajaguru, John Boulton and others consider Odia one of the most ancient languages of India. Within the Indic family of languages, Oriya (Odia) is closest to Sanskrit and least influenced by foreign languages. Only these two Indic languages (viz. Sanskrit and Odia) have received the classical tag, owing to their rich, uninfluenced and long literary history. According to the National Mission for Manuscripts, Odia (2,13,088) has the largest number of documented manuscripts in India after Sanskrit (11,66,743).

    Odia was previously spelt Oriya, and Odisha was spelt Orissa. However, Odia and Odisha are now the officially preferred names in English, as they are closer to the native names: ଓଡ଼ିଆ (oḍiā) [ɔɖiaː] and ଓଡ଼ିଶା (oḍiśā) [ɔɖisaː].

    With reference to Wikipedia (https://en.wikipedia.org/wiki/Odia_language), the Oriya (Odia) language is also used by minority populations of the neighboring states of Jharkhand, West Bengal, Chhattisgarh and Andhra Pradesh. The region has been known at different stages of history as Kalinga, Udra, Utkala or Koshala. Odisha was a vast empire in ancient and medieval times, extending from the Ganges in the north to the Godavari in the south. During British rule, however, Odisha lost its political identity and formed parts of the Bengal and Madras Presidencies. The present state of Odisha was formed in 1936.

    The modern Oriya (Odia) language is formed mostly from Pali words with significant Sanskrit influence. About 28% of modern Oriya (Odia) words have Adivasi origins, and about 2% have Hindustani (Hindi/Urdu), Persian, or Arabic origins. The earliest written texts in the language are about a thousand years old. The first Oriya (Odia) newspaper was Utkala Deepika, first published on 4 August 1866.

    Among the Indo-European languages of India, only Oriya (Odia) and Sanskrit have been recognized as classical languages; of the six Indian languages that have been conferred classical language status, Oriya (Odia) was recognized most recently (in 2014).¹ It forms the basis of Odissi dance and Odissi music.²

    The Oriya script seems to be a variant of Devanāgarī, the main differences being the absence of the shirorekha (the line above the character) and its more rounded shapes. Since it was initially used for commercial ends, it has been referred to as the śarāphi (banker's) or mahājani (trader's) script.

    ¹ Criteria for this status include: high antiquity of its early texts/recorded history over a period of 1500–2000 years; a body of ancient literature/texts which is considered a valuable heritage by generations of speakers; a literary tradition that is original and not borrowed from another speech community.
    ² The variety of Oriya dialects etc. is reviewed in Appendix B.

    The Oriya (Odia) script is used to write the Oriya (Odia) language and a number of other languages spoken in Odisha, such as Munda, Santali, Kui, Ho and Sanskrit.

    3.1 The Evolution of the Script

    The Oriya (Odia) script developed from the Kalinga script, one of the many descendants of the Brahmi script of ancient India (Rajaguru, S. N., Odia Lipira Kramabikash, Odia Sahitya Akademi, page 2). The earliest known inscription in the Oriya (Odia) language, in the Kalinga script, dates from 1051. The language descends from Odra-Magadhi Prakrit, similar to Ardha Magadhi, which was prevalent in eastern India over 1,500 years ago.

    The curved appearance of the Oriya script is a result of the practice of writing on palm leaves, which have a tendency to tear if written on in straight lines.

    The diagram below shows the major stages in the evolution of Oriya, attesting its late divergence from Devanāgarī.

    Figure 1: Pictorial depiction of Evolution of Oriya

    Literature in the Oriya (Odia) language (Odia: ) is the predominant literature of the state of Odisha in India.

    3.2 Periods of Odia History

    Oriya (Odia) language literature (Odia: ) is the predominant literature of the state of Odisha in India.

    Historians have divided the history of Oriya (Odia) literature into five main stages: Old Oriya (Odia) (8th century to 1300), Early Middle Oriya (Odia) (1300 to 1500), Middle Oriya (Odia) (1500 to 1700), Late Middle Oriya (Odia) (1700 to 1850) and Modern Oriya (Odia) (1850 to present).

    3.3 Use of Oriya language beyond India

    According to Wikipedia (https://en.wikipedia.org/wiki/Odia_language), the Oriya (Odia) diaspora constitutes a sizeable number of speakers in several countries around the world, pushing the number of Oriya (Odia) speakers globally to 55 million.

    The language has a significant presence in eastern countries such as Bangladesh and Indonesia, mainly carried there by the sadhaba, ancient traders from Odisha who took the language along with the culture during old-day trading, and in western countries such as the United States, Canada, Australia and England as well. The language has also spread to Burma, Malaysia, Fiji, Sri Lanka and countries of the Middle East. Written Oriya (Odia), or standard Oriya (Odia), is used for official purposes. It has elements from different local Oriya (Odia) dialects, but it usually avoids words of foreign origin such as Arabic and Persian. It has also assimilated many tribal words prevalent in Odisha.

    3.4 Notable features

    The Oriya script is a syllabic alphabet written left to right in horizontal lines, in which all consonants have an inherent vowel. Diacritics, which can appear above, below, before or after the consonant they belong to, are used to change the inherent vowel.

    When they appear at the beginning of a syllable, vowels are written as independent letters.

    When certain consonants occur together, special conjunct shapes are used which combine the essential parts of each letter.

    The chart below shows the way in which the International Phonetic Alphabet (IPA) represents Oriya (Odia) pronunciations.

    Vargya consonants
    IPA | Oriya (Odia)
    b | ବ
    bʱ | ଭ
    d̪ | ଦ
    d̪ʱ | ଧ
    ɖ | ଡ
    ɖɦ | ଢ
    d͡ʒ | ଜ
    d͡ʒɦ | ଝ
    ɡ | ଗ
    ɡʱ | ଘ
    h | ହ
    k | କ
    kʰ | ଖ
    ɲ | ଞ
    m | ମ
    n | ନ
    ɳ | ଣ
    ŋ | ଙ
    p | ପ
    pʰ | ଫ
    ɾ | ର
    ɽ | ଡ଼
    ɽɦ | ଢ଼
    s | ସ
    t̪ | ତ
    t̪ʰ | ଥ
    ʈ | ଟ
    ʈʰ | ଠ
    t͡ʃ | ଚ
    t͡ʃʰ | ଛ

    Avargya consonants
    IPA | Oriya (Odia)
     | ଯ
    j | ୟ
    ɾ | ର
    l | ଲ
    l̪ | ଳ
    ʋ | ଵ
    w | ୱ
    s | ସ
    ʂ | ଷ
    ɕ | ଶ
    ɦ | ହ
    ŋ, ɲ, ɳ, n, m, ◌̃ | ◌ଂ
    ◌̃ | ◌ଁ
    ɦ | ◌ଃ

    Vowels and Matras
    IPA | Vowel | Matra
    ə | ଅ |
    aː | ଆ | ◌ା
    ɪ | ଇ | ◌ି
    iː | ଈ | ◌ୀ
    ʊ | ଉ | ◌ୁ
    uː | ଊ | ◌ୂ
    r̩ | ଋ | ◌ୃ
    eː | ଏ | ◌େ
    ɛː | ଐ | ◌ୈ
    oː | ଓ | ◌ୋ
    ɔː | ଔ | ◌ୌ

    Table 1: International Phonetic Alphabet Oriya Pronunciations

    3.5 Structured consonants

    The structured consonants are classified by place of articulation into five structured groups. These consonants are shown here with their IAST (International Alphabet of Sanskrit Transliteration)³ transcriptions.

    Group | voiceless | voiceless aspirate | voiced | voiced aspirate | nasal
    Velars | କ (ka) | ଖ (kha) | ଗ (ga) | ଘ (gha) | ଙ (ṅa)
    Palatals | ଚ (ca) | ଛ (cha) | ଜ (ja) | ଝ (jha) | ଞ (ña)
    Retroflex | ଟ (ṭa) | ଠ (ṭha) | ଡ (ḍa) | ଢ (ḍha) | ଣ (ṇa)
    Dentals | ତ (ta) | ଥ (tha) | ଦ (da) | ଧ (dha) | ନ (na)
    Labials | ପ (pa) | ଫ (pha) | ବ (ba) | ଭ (bha) | ମ (ma)

    Table 2: Structured consonants

    3.6 Unstructured consonants

    The unstructured consonants are consonants that do not fall into any of the above categories: ଯ (ja), ୟ (ia), ର (ra), ଲ (la), ଳ (ḷa), ଵ (va), ୱ (wa), ଶ (sa), ଷ (sa), ସ (sa), ହ (ha)

    3.7 The implicit vowel killer Halant (Virama)

    The halant character is used after a consonant to "strip" it of its inherent vowel. A consonant syllable cannot end with halant. With a few exceptions, most Oriya words are svaranta (i.e., ending with a vowel).

    A syllable containing halant characters may be shaped with no visible halant signs, as the halants enable different consonants to form conjuncts.

    Halant form of consonants - The form produced by adding the halant to the nominal shape. The halant form is used in syllables that have no vowel, or as the half form when no distinct shape for the half form exists.

    Half form of consonants (pre-base form) - A variant form of consonants which appears to the left of the base consonant if they do not participate in a ligature. Consonants in their half form precede the ones forming the base glyph. Some Indic scripts, like Devanagari, have distinctly shaped half forms for most of the consonants. If no distinct shape exists, the full form will display with an explicit Virama (same shape as the halant form).

    ³ The International Alphabet of Sanskrit Transliteration (I.A.S.T.) is a transliteration scheme that allows the lossless romanization of Indic scripts as employed by Sanskrit and related Indic languages. IAST makes it possible for the reader to read the Indic text unambiguously, exactly as if it were in the original Indic script. Example: କ (U+0B15) ka, ଖ (U+0B16) kha, etc.

    3.8 Nukta (◌଼ - U+0B3C):

    The nukta sign is used in Oriya just as in other Indian scripts. It is used with a few consonants to represent sounds found only in words borrowed from Perso-Arabic. It is commonly used with “ଡ” U+0B21, “ଢ” U+0B22, “କ” U+0B15, “ଖ” U+0B16, “ଗ” U+0B17, “ଚ” U+0B1A, “ଜ” U+0B1C and “ଫ” U+0B2B to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style.

    3.9 Visarga “◌ଃ” (U+0B03) and Avagraha “ଽ” (U+0B3D):

    The Visarga “◌ଃ” (U+0B03) is frequently used in Sanskrit and represents a sound very close to /h/. Example: ଦୁଃଖ /du:kh/ 'sorrow' (U+0B26 U+0B41 U+0B03 U+0B16). The Avagraha “ଽ” (U+0B3D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages written in Oriya. For this LGR, the Avagraha is not part of the repertoire because it is excluded from the Maximal Starting Repertoire.

    3.10 Candrabindu (◌ଁ - U+0B01):

    Candrabindu denotes nasalization of the preceding vowel and consonants, as in ଅଁଳା /ãala/, the name of a seasonal fruit (U+0B05 U+0B01 U+0B33 U+0B3E). Oriya users commonly use it for writing the words and sounds of the Sanskrit language.

    3.11 Anusvara (◌ଂ - U+0B02):

    Anusvara replaces a conjunct group of a Nasal Consonant + Halant + Consonant belonging to that particular varga. In this position the Anusvara represents a homorganic nasal. Before a non-varga consonant, the Anusvara represents a nasal sound. For example: ଏବଂ (0B0F + 0B2C + 0B02), ସଂଖ୍ୟା (0B38 + 0B02 + 0B16 + 0B4D + 0B5F + 0B3E), etc.
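The code point sequences quoted in the examples above can be checked mechanically. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the helper name `code_points` and the constants `EBAM` and `SANKHYA` are illustrative and not part of the proposal.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as 4-digit uppercase hex strings."""
    return [f"{ord(ch):04X}" for ch in text]

# The two Anusvara examples from Section 3.11, written as escapes so the
# sequence is unambiguous regardless of font or rendering.
EBAM = "\u0B0F\u0B2C\u0B02"
SANKHYA = "\u0B38\u0B02\u0B16\u0B4D\u0B5F\u0B3E"

if __name__ == "__main__":
    for word in (EBAM, SANKHYA):
        print(word, "+".join(code_points(word)))
        for ch in word:
            # unicodedata.name gives the formal Unicode character name.
            print(f"  U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Printing the formal names next to each code point makes it easy to confirm that a rendered word really contains the Anusvara (U+0B02) rather than a visually similar sign.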

    3.12 Matra sign (Dependent Vowel)

    A matra is used to represent a vowel sound that is not inherent to the consonant. Dependent vowels are referred to as "matras". They are always depicted in combination with a single consonant or with a consonant cluster. The greatest variation among different Indian scripts is found in the rules for attaching dependent vowels to base characters. The rules specific to Oriya are mentioned in Section 6 (Variants) and Section 7 (WLE Rules).

    The following table shows the correlation between a vowel and its matra sign.

    Vowel and its matra sign
    Glyph | Unicode | Name | Glyph | Unicode | Name
    ଅ | U+0B05 | ORIYA LETTER A | | |
    ଆ | U+0B06 | ORIYA LETTER AA | ◌ା | U+0B3E | ORIYA VOWEL SIGN AA
    ଇ | U+0B07 | ORIYA LETTER I | ◌ି | U+0B3F | ORIYA VOWEL SIGN I
    ଈ | U+0B08 | ORIYA LETTER II | ◌ୀ | U+0B40 | ORIYA VOWEL SIGN II
    ଉ | U+0B09 | ORIYA LETTER U | ◌ୁ | U+0B41 | ORIYA VOWEL SIGN U
    ଊ | U+0B0A | ORIYA LETTER UU | ◌ୂ | U+0B42 | ORIYA VOWEL SIGN UU
    ଋ | U+0B0B | ORIYA LETTER VOCALIC R | ◌ୃ | U+0B43 | ORIYA VOWEL SIGN VOCALIC R
    ଏ | U+0B0F | ORIYA LETTER E | ◌େ | U+0B47 | ORIYA VOWEL SIGN E
    ଐ | U+0B10 | ORIYA LETTER AI | ◌ୈ | U+0B48 | ORIYA VOWEL SIGN AI
    ଓ | U+0B13 | ORIYA LETTER O | ◌ୋ | U+0B4B | ORIYA VOWEL SIGN O
    ଔ | U+0B14 | ORIYA LETTER AU | ◌ୌ | U+0B4C | ORIYA VOWEL SIGN AU
    ଌ | U+0B0C | ORIYA LETTER VOCALIC L | ◌ୢ | U+0B62 | ORIYA VOWEL SIGN VOCALIC L
    ୡ | U+0B61 | ORIYA LETTER VOCALIC LL | ◌ୣ | U+0B63 | ORIYA VOWEL SIGN VOCALIC LL

    Table 3: Vowel and its Matra Sign

    “ଌ” U+0B0C, “ୡ” U+0B61, “◌ୢ” U+0B62 and “◌ୣ” U+0B63 are hardly used today.
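The vowel-to-matra correspondence in Table 3 can be expressed as a simple lookup. The sketch below is illustrative only: the names `VOWEL_TO_MATRA`, `INHERENT_VOWEL` and `syllable` are assumptions, and the function covers just the single-consonant case, not consonant clusters.

```python
# Independent vowel -> dependent vowel sign (matra), transcribed from Table 3.
# The last two pairs are the rarely used VOCALIC L / LL forms.
VOWEL_TO_MATRA = {
    "\u0B06": "\u0B3E",  # AA
    "\u0B07": "\u0B3F",  # I
    "\u0B08": "\u0B40",  # II
    "\u0B09": "\u0B41",  # U
    "\u0B0A": "\u0B42",  # UU
    "\u0B0B": "\u0B43",  # VOCALIC R
    "\u0B0F": "\u0B47",  # E
    "\u0B10": "\u0B48",  # AI
    "\u0B13": "\u0B4B",  # O
    "\u0B14": "\u0B4C",  # AU
    "\u0B0C": "\u0B62",  # VOCALIC L
    "\u0B61": "\u0B63",  # VOCALIC LL
}

# ORIYA LETTER A has no matra: it is the vowel inherent in every consonant.
INHERENT_VOWEL = "\u0B05"

def syllable(consonant, vowel):
    """Spell consonant + vowel the way the script encodes it."""
    if vowel == INHERENT_VOWEL:
        return consonant
    return consonant + VOWEL_TO_MATRA[vowel]
```

For example, `syllable("\u0B15", "\u0B06")` yields KA followed by VOWEL SIGN AA, while pairing KA with the inherent vowel returns the bare consonant.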

    4 Overall Development Process and Methodology

    The Neo-Brahmi Generation Panel covers many different scripts belonging to separate Unicode blocks. Each of these scripts is the basis for a separate LGR proposal; however, the Neo-Brahmi GP ensures that the fundamental philosophy behind building these LGRs is the same across the different scripts being considered. This is the Oriya (Odia) LGR, which caters to languages written in the Oriya (Odia) script that fall within EGIDS scale 1 to 4.

    4.1 Guiding Principles

    The NBGP adopts the following broad principles for the selection of code points in the repertoire, across the board, for all the scripts within its scope.

    4.1.1 Inclusion principles:

    4.1.1.1 Modern usage: Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.

    4.1.1.2 Unambiguous use:

    Every character proposed should be understood unambiguously within the linguistic community with respect to its usage in the language.

    4.1.2 Exclusion principles:

    The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards which are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations but have been spelt out separately for the sake of clarity.

    4.1.2.1 External Limits on Scope: The code point repertoire for the root zone is a very special case, high up the ladder in the protocol hierarchy, so the canvas of characters available for selection as part of the Root Zone code point repertoire is already constrained by the protocol layers beneath it. The following three main protocols/standards act as successive filters:

    i. The Unicode Chart:

    If a character needed by the given script is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.

    ii. IDNA Protocol:

    Unicode, being the character encoding standard that aims at the maximum possible representation of a given script/language, has encoded as far as possible all the characters needed by the script. However, domain names being a specialized case, they are governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters in the Unicode repertoire from being part of domain names.

    Example: Oriya script frequently uses “ଡ” (U+0B21) and “ଢ” (U+0B22) as well as their respective allophones “ଡ଼” and “ଢ଼”. In the Oriya (Odia) script, these differ in the use of the nukta. Thus “ଡ଼” and “ଢ଼” as distinct precomposed letters (U+0B5C and U+0B5D) are not allowed, but their decomposed forms, i.e. “ଡ” or “ଢ” followed by the Oriya (Odia) sign nukta (U+0B3C), can be used. Similarly, the nukta can be used for allophones of other consonants such as କ (U+0B15), ଖ (U+0B16), ଗ (U+0B17), ଚ (U+0B1A), ଜ (U+0B1C) and ଫ (U+0B2B).

    iii. Maximal Starting Repertoire:

    As the Root Zone LGR is used for the creation of root zone TLDs, which in turn are an even more specialized case of domain name labels, the Root Zone LGR procedure introduces additional exclusions for characters allowed by IDNA.

    Example: Oriya Sign Avagraha "ଽ" (U+0B3D), even though allowed by the IDNA protocol, is not permitted in the Root Zone repertoire as per the [MSR].

    The Maximal Starting Repertoire also excludes the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D). These are required in certain cases where an atypical visual shape of an akshar is desired.

    To sum up, the restrictions start off by admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
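Two of the filters described above can be illustrated in a few lines. This is a minimal sketch, not the Root Zone tooling: it relies on the fact that IDNA2008 labels must be in Unicode Normalization Form C and that U+0B5C and U+0B5D are excluded from composition, so NFC always yields the consonant plus nukta. The set `MSR_EXCLUDED_EXAMPLES` holds only the three code points named in the text, not the full MSR exclusion list, and both function names are illustrative.

```python
import unicodedata

# Only the exclusions mentioned in Section 4.1.2.1; an assumption-free full
# list would have to be taken from the MSR itself.
MSR_EXCLUDED_EXAMPLES = {0x0B3D, 0x200C, 0x200D}

def to_nfc(label):
    """Normalize a label to NFC, the form IDNA2008 expects."""
    return unicodedata.normalize("NFC", label)

def msr_excluded(label):
    """List code points in label that the text names as excluded by the MSR."""
    return [f"U+{ord(ch):04X}" for ch in label
            if ord(ch) in MSR_EXCLUDED_EXAMPLES]

if __name__ == "__main__":
    # Precomposed RRA normalizes to DDA + NUKTA.
    print([f"{ord(c):04X}" for c in to_nfc("\u0B5C")])
    # A label containing ZWJ and AVAGRAHA is flagged on both counts.
    print(msr_excluded("\u0B15\u200D\u0B3D"))
```

Normalizing first and filtering afterwards mirrors the order of the filters in the text: the IDNA layer determines the form of the code points, and the MSR then removes some of those that remain.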

    4.1.2.2 No Fraction Marks: Since TLDs are identifiers, fraction markers present in Brahmi-based languages, such as those given below, will not be included.

    Figure 2: Fraction Marks in Oriya

    4.1.2.3 No Symbols and Abbreviations: Abbreviations, weights and measures, and other such iconic characters like Isshar "୰" (U+0B70) will not be included.

    4.1.2.4 No Rare and Obsolete Characters: There are characters which have been added to Unicode to accommodate rare forms, such as ORIYA LETTER VOCALIC RR "ୠ" (U+0B60) and ORIYA LETTER VOCALIC LL "ୡ" (U+0B61) as well as their matra forms “◌ୄ” (U+0B44) and "◌ୣ" (U+0B63). All such characters will be excluded. This is in compliance with the Conservatism principle laid down in the Root Zone LGR procedure.

    5 Repertoire

    This section provides the relevant section of the [MSR] applicable to the Oriya script, on which the Oriya code point repertoire for the Root Zone LGR is based. Section 5.2 details the code-point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Oriya Root Zone LGR.

    5.1 Oriya Section of Maximal Starting Repertoire [MSR] Version 3

    Figure 3: Oriya Code Page from MSR-3

    Color convention:
    All characters that are included in the [MSR] - Yellow background
    PVALID in IDNA2008 but excluded from the [MSR] - Pinkish background
    Not PVALID in IDNA2008 - White background

    5.2 Code Point Repertoire

    The table below lists all the code points included in the repertoire for the Oriya script. For each code point, language references are given in the last column.

    Sr. No. | Unicode Code Point | Glyph | Character Name | Language with EGIDS | Indic Syllabic Category | References
    1 | 0B01 | ◌ଁ | ORIYA SIGN CANDRABINDU | 2-Oriya | Candrabindu | [0], [101], [102], [103], [104], [105]
    2 | 0B02 | ◌ଂ | ORIYA SIGN ANUSVARA | 2-Oriya | Anusvara | [0], [101], [102], [103], [104], [105]
    3 | 0B03 | ଃ | ORIYA SIGN VISARGA | 2-Oriya | Visarga | [0], [101], [102], [103], [104], [105]
    4 | 0B05 | ଅ | ORIYA LETTER A | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    5 | 0B06 | ଆ | ORIYA LETTER AA | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    6 | 0B07 | ଇ | ORIYA LETTER I | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    7 | 0B08 | ଈ | ORIYA LETTER II | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    8 | 0B09 | ଉ | ORIYA LETTER U | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    9 | 0B0A | ଊ | ORIYA LETTER UU | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    10 | 0B0B | ଋ | ORIYA LETTER VOCALIC R | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    11 | 0B0F | ଏ | ORIYA LETTER E | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    12 | 0B10 | ଐ | ORIYA LETTER AI | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    13 | 0B13 | ଓ | ORIYA LETTER O | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    14 | 0B14 | ଔ | ORIYA LETTER AU | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    15 | 0B15 | କ | ORIYA LETTER KA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    16 | 0B16 | ଖ | ORIYA LETTER KHA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    17 | 0B17 | ଗ | ORIYA LETTER GA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    18 | 0B18 | ଘ | ORIYA LETTER GHA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    19 | 0B19 | ଙ | ORIYA LETTER NGA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    20 | 0B1A | ଚ | ORIYA LETTER CA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    21 | 0B1B | ଛ | ORIYA LETTER CHA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    22 | 0B1C | ଜ | ORIYA LETTER JA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    23 | 0B1D | ଝ | ORIYA LETTER JHA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    24 | 0B1E | ଞ | ORIYA LETTER NYA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    25 | 0B1F | ଟ | ORIYA LETTER TTA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    26 | 0B20 | ଠ | ORIYA LETTER TTHA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    27 | 0B21 | ଡ | ORIYA LETTER DDA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    28 | 0B22 | ଢ | ORIYA LETTER DDHA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    29 | 0B23 | ଣ | ORIYA LETTER NNA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    30 | 0B24 | ତ | ORIYA LETTER TA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    31 | 0B25 | ଥ | ORIYA LETTER THA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    32 | 0B26 | ଦ | ORIYA LETTER DA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    33 | 0B27 | ଧ | ORIYA LETTER DHA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    34 | 0B28 | ନ | ORIYA LETTER NA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    35 | 0B2A | ପ | ORIYA LETTER PA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    36 | 0B2B | ଫ | ORIYA LETTER PHA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    37 | 0B2C | ବ | ORIYA LETTER BA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    38 | 0B2D | ଭ | ORIYA LETTER BHA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    39 | 0B2E | ମ | ORIYA LETTER MA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    40 | 0B2F | ଯ | ORIYA LETTER YA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    41 | 0B30 | ର | ORIYA LETTER RA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    42 | 0B32 | ଲ | ORIYA LETTER LA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    43 | 0B33 | ଳ | ORIYA LETTER LLA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    44 | 0B35 | ଵ | ORIYA LETTER VA | 2-Oriya | Consonant | [6], [101], [102], [103], [104], [105]
    45 | 0B36 | ଶ | ORIYA LETTER SHA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    46 | 0B37 | ଷ | ORIYA LETTER SSA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    47 | 0B38 | ସ | ORIYA LETTER SA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    48 | 0B39 | ହ | ORIYA LETTER HA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    49 | 0B3C | ◌଼ | ORIYA SIGN NUKTA | 2-Oriya | Nukta | [0], [101], [102], [103], [104], [105]
    50 | 0B3E | ◌ା | ORIYA VOWEL SIGN AA | 2-Oriya | Matra | [0], [101], [102], [103], [104], [105]
    51 | 0B3F | ◌ି | ORIYA VOWEL SIGN I | 2-Oriya | Matra | [0], [101], [102], [103], [104], [105]
    52 | 0B40 | ◌ୀ | ORIYA VOWEL SIGN II | 2-Oriya | Matra | [0], [101], [102], [103], [104], [105]
    53 | 0B41 | ◌ୁ | ORIYA VOWEL SIGN U | 2-Oriya | Matra | [0], [101], [102], [103], [104], [105]
    54 | 0B42 | ◌ୂ | ORIYA VOWEL SIGN UU | 2-Oriya | Matra | [0], [101], [102], [103], [104], [105]
    55 | 0B43 | ◌ୃ | ORIYA VOWEL SIGN VOCALIC R | 2-Oriya | Matra | [0], [101], [102], [103], [104], [105]
    56 | 0B47 | ◌େ | ORIYA VOWEL SIGN E | 2-Oriya | Matra | [0], [101], [102], [103], [104], [105]
    57 | 0B48 | ◌ୈ | ORIYA VOWEL SIGN AI | 2-Oriya | Matra | [0], [101], [102], [103], [104], [105]
    58 | 0B4B | ◌ୋ | ORIYA VOWEL SIGN O | 2-Oriya | Matra | [0], [101], [102], [103], [104], [105]
    59 | 0B4C | ◌ୌ | ORIYA VOWEL SIGN AU | 2-Oriya | Matra | [0], [101], [102], [103], [104], [105]
    60 | 0B4D | ◌୍ | ORIYA SIGN VIRAMA | 2-Oriya | Halant | [0], [101], [102], [103], [104], [105]
    61 | 0B56 | ◌ୖ | ORIYA AI LENGTH MARK | 2-Oriya | Matra | [2], [101], [102], [103], [104], [105]
    62 | 0B5F | ୟ | ORIYA LETTER YYA | 2-Oriya | Consonant | [0], [101], [102], [103], [104], [105]
    63 | 0B71 | ୱ | ORIYA LETTER WA | 2-Oriya | Consonant | [6], [101], [102], [103], [104], [105]

    Table 4: Code Point Repertoire

    5.2.1 Code Points Excluded

    Sr. No. | Unicode Code Point | Glyph | Character Name | Language with EGIDS | Indic Syllabic Category | Reference
    1 | 0B0C | ଌ | ORIYA LETTER VOCALIC L | 2-Oriya | Vowel | [0], [101], [102], [103], [104], [105]
    2 | 0B44 | ◌ୄ | ORIYA VOWEL SIGN VOCALIC RR | 2-Oriya | Matra | [9], [101], [102], [103], [104], [105]
    3 | 0B57 | ୗ | ORIYA AU LENGTH MARK | 2-Oriya | Matra | [0], [101], [102], [103], [104], [105]

    Table 5: Code Points Excluded from Repertoire

    Since the matra ୗ (U+0B57) ORIYA AU LENGTH MARK is not in current use by the Oriya community, the NBGP decided to exclude it. Also, “ଌ” U+0B0C, “ୡ” U+0B61, “◌ୢ” U+0B62 and “◌ୣ” U+0B63 are hardly used today.
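A repertoire check against Table 4 can be sketched as follows. The inclusive ranges below are transcribed from the 63 rows of Table 4; the names `REPERTOIRE_SPANS`, `REPERTOIRE` and `outside_repertoire` are illustrative, and the authoritative list remains the accompanying XML file.

```python
# Inclusive (first, last) code point ranges covering the rows of Table 4.
REPERTOIRE_SPANS = [
    (0x0B01, 0x0B03), (0x0B05, 0x0B0B), (0x0B0F, 0x0B10), (0x0B13, 0x0B28),
    (0x0B2A, 0x0B30), (0x0B32, 0x0B33), (0x0B35, 0x0B39), (0x0B3C, 0x0B3C),
    (0x0B3E, 0x0B43), (0x0B47, 0x0B48), (0x0B4B, 0x0B4D), (0x0B56, 0x0B56),
    (0x0B5F, 0x0B5F), (0x0B71, 0x0B71),
]

REPERTOIRE = {cp for first, last in REPERTOIRE_SPANS
              for cp in range(first, last + 1)}

def outside_repertoire(label):
    """Return the code points of label that are not listed in Table 4."""
    return [f"U+{ord(ch):04X}" for ch in label if ord(ch) not in REPERTOIRE]

if __name__ == "__main__":
    print(len(REPERTOIRE))
    # KA is in the repertoire; VOCALIC L and AU LENGTH MARK (Table 5) are not.
    print(outside_repertoire("\u0B15\u0B0C\u0B57"))
```

Counting the set is a quick consistency check on the transcription: it should equal the number of rows in Table 4.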

    5.2.2 Variables involved

    C → Consonant

    M → Matra

    V → Vowel

    B → Anusvara

    H → Halant/Virama

    N → Nukta

    C1 → {କ 0B15, ଖ 0B16, ଗ 0B17, ଚ 0B1A, ଜ 0B1C, ଡ 0B21, ଢ 0B22, ଫ 0B2B}

    X → Visarga

    D → Candrabindu

    6 Variants

    6.1 In-Script Variants

    In the Oriya script, no characters or character sequences that can be created from the Oriya characters permitted by the [MSR] look identical to one another. There are therefore no in-script variants.

    6.2 Cross-Script Variants

    A cross-script variant label, also sometimes referred to as a "Whole Label confusable", is the variant case where a label in one script can be composed in such a way that it resembles an entire label in a different script.

    Every individual LGR under NBGP provides a set of cross-script variant code points that it identifies with members of other related scripts.

    NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Oriya with all the other scripts under NBGP. This was achieved by sharing a list of most of the akshar combinations with all the other script teams. ("Most" is used here because all possible Consonant + Halant + Consonant + … cases cannot practically be covered; all Oriya "Consonant + Halant + Consonant" cases were included in the analysis.)

    The Oriya script has a set of possible cross-script variants only with the Malayalam script. The cases listed in Table 6 are cross-script variants between Oriya and Malayalam. This follows the NBGP cross-script variant inclusion policy available in Appendix D.

    Note that none of the combinations listed in Table 6 are considered equivalents of each other, semantically or otherwise. They are grouped only because the two script communities consider them visually the same.

    NBGP has ensured that the Oriya and Malayalam LGR teams propose the same set of cross-script variants by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Malayalam) is expected to be found in the Malayalam LGR documents.

    Variant Set | Oriya CP | Oriya Glyph | Malayalam CP | Malayalam Glyph
    1 | 0B20 | ଠ | 0D20 | ഠ

    Table 6: Variant set between Oriya script and Malayalam script

    The cases listed in Appendix B are visually confusable code points given for reference, but they are not defined as variant code points.
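Because Table 6 contains a single code point pair, a whole-label variant in Malayalam exists only for Oriya labels made up entirely of U+0B20. The sketch below illustrates that consequence; the mapping name `ORIYA_TO_MALAYALAM` and the function `malayalam_variant` are illustrative, and how the resulting variant label is dispositioned (for example, blocked) is defined in the LGR XML, not here.

```python
# Cross-script variant mapping transcribed from Table 6 (Oriya -> Malayalam).
ORIYA_TO_MALAYALAM = {0x0B20: 0x0D20}

def malayalam_variant(label):
    """Return the Malayalam whole-label variant, or None if there is none.

    A whole-label variant exists only when every code point of the label
    has a variant in the other script.
    """
    if not label:
        return None
    mapped = []
    for ch in label:
        target = ORIYA_TO_MALAYALAM.get(ord(ch))
        if target is None:
            return None  # one unmapped code point rules out a variant label
        mapped.append(chr(target))
    return "".join(mapped)
```

For instance, a label of two TTHA letters maps to two Malayalam TTHA letters, while a label that mixes TTHA with any other Oriya letter has no Malayalam variant.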

    7 Whole Label Evaluation Rules (WLE)

    This section provides the whole label evaluation rules for text written in the Oriya script. The rules have been drafted in such a way that they can be easily translated into the LGR specification. Below are the symbols used in the WLE rules for each "Indic Syllabic Category" mentioned in Table 4: Code Point Repertoire. In addition, a few additional symbols define the appropriate subsets for various rules.

    C → Consonant
    M → Matra
    V → Vowel
    B → Anusvara
    H → Halant/Virama
    N → Nukta
    C1 → {କ 0B15 KA, ଖ 0B16 KHA, ଗ 0B17 GA, ଚ 0B1A CA, ଜ 0B1C JA, ଡ 0B21 DDA, ଢ 0B22 DDHA, ଫ 0B2B PHA}
    X → Visarga
    D → Candrabindu

    Rule 1: N (◌଼) must be preceded only by C1
    For example:
    ଡ (0B21) + ଼ (0B3C) = ଡ଼
    ଢ (0B22) + ଼ (0B3C) = ଢ଼

    Rule 2: B (◌ଂ) must be preceded by V, C, N or M
    i) B may be preceded by V (example: ଅଂଶ)
    ii) B may be preceded by C (examples: ସଂସାର, ବଂଶ)
    iii) B may be preceded by N (example: ଡ଼ଂଗା)
    iv) B may be preceded by M (examples: ସିଂହ, ମାଂସ, ବିଂଶ, ସୁତରାଂ)

    Rule 3: X (◌ଃ) must be preceded by C, V, N or M
    i) X may be preceded by C (examples: ପ୍ରାୟତଃ, କ୍ରମଶଃ)
    ii) X may be preceded by N (example: ଡ଼ଃ)
    iii) X may be preceded by M (examples: ଦୁଃଖ, ଦୁଃଖିତ)
    iv) X may be preceded by V (examples: ଅଃ, ଆଃ, ଇଃ, ଉଃ); commonly used when writing Sanskrit or when there is a religious requirement

    Rule 4: D (◌ଁ) must be preceded by V, C, N or M
    i) D may be preceded by V (examples: ପାଇଁ, ଯେଉଁ, ନିଆଁ)
    ii) D may be preceded by C (examples: ମୁହଁ, ପହଁରା, ନୁହଁ)
    iii) D may be preceded by N (example: ଡ଼ଁାଶ)
    iv) D may be preceded by M (examples: ନାହିଁ, ନାଁ, ଗାଁରେ)

    Rule 5: H (◌୍) must be preceded by C or N
    i) H may be preceded by C (examples: ଠିକ୍, ଭୁଲ୍)
    ii) H may be preceded by N (examples: ଡ଼୍ୟୁଟି, ଡ଼୍ରାଗନ)

    Rule 6: M must be preceded by C or N
    i) M may be preceded by C (examples: ମୁହଁ, ପହଁରା, ନୁହଁ)
    ii) M may be preceded by N (example: ଡ଼ାଇମିଟର)
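Rules 1 to 6 all constrain the code point that immediately precedes a sign, so they can be sketched as a single left-to-right pass. The following is a minimal sketch under stated assumptions, not the normative LGR: the class sets are transcribed from Table 4, "preceded by" is read as "immediately preceded by", and the names `span_set`, `kind`, `ALLOWED_BEFORE` and `violations` are illustrative. The accompanying XML file remains the formal specification.

```python
def span_set(*spans):
    """Expand inclusive (first, last) pairs into a set of code points."""
    return {cp for first, last in spans for cp in range(first, last + 1)}

# Classes transcribed from the Indic Syllabic Category column of Table 4.
V = span_set((0x0B05, 0x0B0B), (0x0B0F, 0x0B10), (0x0B13, 0x0B14))
C = span_set((0x0B15, 0x0B28), (0x0B2A, 0x0B30), (0x0B32, 0x0B33),
             (0x0B35, 0x0B39), (0x0B5F, 0x0B5F), (0x0B71, 0x0B71))
M = span_set((0x0B3E, 0x0B43), (0x0B47, 0x0B48), (0x0B4B, 0x0B4C),
             (0x0B56, 0x0B56))
C1 = {0x0B15, 0x0B16, 0x0B17, 0x0B1A, 0x0B1C, 0x0B21, 0x0B22, 0x0B2B}
SINGLETONS = {0x0B01: "D", 0x0B02: "B", 0x0B03: "X", 0x0B3C: "N", 0x0B4D: "H"}

# Rules 2-6: classes allowed immediately before each class, with rule number.
ALLOWED_BEFORE = {
    "B": (2, set("VCNM")),
    "X": (3, set("VCNM")),
    "D": (4, set("VCNM")),
    "H": (5, set("CN")),
    "M": (6, set("CN")),
}

def kind(cp):
    """Return the class letter of a code point, or None if it is unlisted."""
    if cp in V:
        return "V"
    if cp in C:
        return "C"
    if cp in M:
        return "M"
    return SINGLETONS.get(cp)

def violations(label):
    """Return a list of rule violations found in label (empty if none)."""
    problems = []
    prev = None  # code point immediately before the current one
    for i, ch in enumerate(label):
        cp = ord(ch)
        k = kind(cp)
        if k is None:
            problems.append(f"position {i}: U+{cp:04X} is not in the repertoire")
        elif k == "N":
            if prev not in C1:  # Rule 1
                problems.append(f"position {i}: Rule 1, N must follow C1")
        elif k in ALLOWED_BEFORE:
            rule, allowed = ALLOWED_BEFORE[k]
            prev_kind = kind(prev) if prev is not None else None
            if prev_kind not in allowed:
                problems.append(f"position {i}: Rule {rule}, {k} not allowed here")
        prev = cp
    return problems
```

As a usage example, `violations("\u0B26\u0B41\u0B03\u0B16")` (the word for 'sorrow' from Section 3.9) returns an empty list, while a label that begins with a Halant, or that places a Nukta after MA, returns one entry naming the rule that failed.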

    8 Contributors

    ThisproposalispreparedandsubmittedbyMr.KuldeepPatnaik(Freelancer)andreviewedbyDr.DebashishyaJethyOriya(Odia)linguisticanalyst,translatingmedicalsciencetoOdia,coauthorofabook(Odiaequivalentsofscientificterms)fromCentralInstitutionofIndianLanguages,Mysore.

    FollowingNBGPmembershelpedMr.KuldeepPatnaiktotakecrucialdecisionwhileworkingtogetherfornineIndianlanguagesincludingOriya(Odia).

    Position Name Organization Country LanguageExpertise

    Co-Chair AjayData DataXgenTechnologies India Hindi,EnglishCo-Chair MaheshD.Kulkarni C-DAC India Marathi,HindiCo-Chair UdayaNarayana

    SinghVisva-Bharati,Santiniketan,WestBengal

    India Bengali,Maithili,Hindi,English

    Member AkshatS.Joshi C-DAC India Hindi,MarathiMember AtiurRahmanKhan C-DAC India BanglaMember DrDebasishyaJethy Oriya(Odia)linguistic

    analystIndia Odia

    Member JayPaudyal Consultant India HindiMember NehaGupta C-DAC India HindiMember ShanmugamR C-DAC India TamilMember VeenaSolomon (freelancer) India Malayalam

    FollowingisthelistofotherNBGPmemberswiththeirlanguageexpertise.

    Position Name Organization Country LanguageExpertise

    Member AbhijitDutta Wikimedia India Bengali,Hindi

  • ProposalforanOriyaRootZoneLGR Neo-BrahmiGenerationPanel

    22

    Member | Anivar A. Aravind | Indic Project | India | Malayalam
    Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
    Member | Arvind Bhandari | Gujarat University | India | Gujarati
    Member | Ashish Modi | DataXgen Technologies | India | Hindi
    Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
    Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
    Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
    Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
    Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
    Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language Art and Culture | India | Assamese
    Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
    Member | Dhanalakshmi KT | Northern Trust | India | Kannada
    Member | Ganesh Murmu | Ranchi University | India | Santali
    Member | Gangadhar Panday | Babul Films Society | India | Telugu
    Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
    Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
    Member | Gurpreet Singh Lehal | Punjabi University Patiala | India | Panjabi
    Member | Harish Chowdhary | NIXI | India | Hindi
    Member | Hempal Shrestha | Nepal Entrepreneurs' Hub (NEHUB) | Nepal | Nepali, Newar
    Member | Jijo Pappachan | DN.Domains | India | Malayalam
    Member | K.C. Tikayatray | Odia Bhasa Pratisthan | India | Odia
    Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
    Member | Kuldeep Patnaik (Editor) | Freelancer | India | Odia
    Member | Mukesh Saini | Essel Group | India | Hindi
    Member | N. Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
    Member | Nirajan Parajuli | NREN | Nepal | Nepali
    Member | Nishit Jain | C-DAC | India | Hindi
    Member | Pawan Chitrakar | Gapsco | Nepal | Nepali


    Member | Prabhakar Pandey | C-DAC | India | Hindi
    Member | Prasad PK | A-one Publishers | India | Malayalam
    Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
    Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
    Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
    Member | Rajiv Kumar | NIXI | India |
    Member | S. Maniam | International Forum IT for Tamil | Singapore | Tamil
    Member | Santhosh Thottingal | Wikimedia Foundation | India | Malayalam, Sourashtra, Tamil
    Member | Saroja Bhate | University of Pune | India | Sanskrit
    Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
    Member | Shantaram S. Warde Walawalikar | Independent Researcher | India | Konkani
    Member | Shashi Pathania | P.G.D. of Dogri, University of Jammu | India | Dogri
    Member | Shubham Saran | NIXI | India |
    Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
    Member | Sujith Kartha | Digitalkz.com | India | Malayalam
    Member | Suraj Adhikari | Mercantile Communications (and .np ccTLD) | Nepal | Nepali
    Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
    Member | U.B. Pavanaja | http://vishvakannada.com/ | India | Kannada
    Member | Uma Maheshwar G | CALTS, Univ. of Hyderabad | India | Telugu
    Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
    Member | Vinay Murarka | Consultant; https://मेरा.भारत | India | Hindi

    Table 7: Contributors and NBGPanel


    9 References

    [MSR] Integration Panel, "Maximal Starting Repertoire - MSR-3 Overview and Rationale", 28 March 2018, https://www.icann.org/sites/default/files/packages/lgr/msr/msr-3-wle-rules-28mar18-en.html

    [NBGP] Neo-Brahmi Generation Panel

    [CodeCharts] The Unicode Standard 10.0 Character Code Charts, http://www.unicode.org/charts/PDF/U0B00.pdf (Accessed on 12 January 2018)

    [0] The Unicode Standard 1.1. Any code point originally encoded in Unicode 1.1

    [6] The Unicode Standard 4.0. Any code point originally encoded in Unicode 4.0

    [9] The Unicode Standard 5.1. Any code point originally encoded in Unicode 5.1

    [101] Omniglot, "Oriya", https://www.omniglot.com/writing/oriya.htm (Accessed on 12 January 2018)

    [102] Odia (Oriya) alphabet - Wikipedia, https://en.wikipedia.org/wiki/Odia_alphabet (Accessed on 12 January 2018)

    [103] Odia language - Wikipedia, https://en.wikipedia.org/wiki/Odia_language (Accessed on 12 January 2018)

    [104] Oriya (Unicode block) - Wikipedia, https://en.wikipedia.org/wiki/Oriya_(Unicode_block)

    [105] Odisha State Govt. Primary School Grade 1 e-book "Hasa Khela", by Odisha Primary Education Programme Authority, http://opepa.odisha.gov.in/website/Download/e-Text-Book/CLass%20I/Hasa%20Khela%20Part%20II/Haso%20Khelo-II-Page-112.pdf

    10 Appendix A: Cross-script Confusable Code Points

    The Oriya script has a set of possible cross-script confusable code points with the Gujarati, Bengali, Telugu, Kannada and Malayalam scripts.

    10.1 Oriya and Gujarati

    The following characters are visually confusable. The NBGP discussed and concluded that they are similar code points but should not be considered as variant code points.

    Oriya Gujarati

    ◌ଃ(0B03) ◌ઃ(0A83)

    ପ(0B2A) ઘ(0A98)

    ଥ(0B25) થ(0AA5)

    Table 8: Confusable code points between the Oriya and Gujarati scripts

    10.2 Oriya and Bengali

    The following characters are visually confusable. The NBGP discussed and concluded that they are similar code points and should not be considered as variant code points.

    Bengali Oriya

    ও(0993) ଓ(0B13)

    Table 9: Confusable code points between the Oriya and Bengali scripts

    The following characters were discussed and the NBGP concluded that they are neither variant code points nor confusable code points.

    Bengali Oriya Resolution

    ঘ(0998) ସ(0B38) Distinguishable

    Table 10: Other resolutions between Oriya script and Bengali script

    10.3 Oriya and Telugu

    The following characters were discussed and the NBGP concluded that they are neither variant code points nor confusable code points.


    Oriya Telugu Resolution

    ଠ(0B20) ర(0C30) Distinguishable

    ଠ(0B20) ఠ(0C20) Distinguishable

    Table 11: Other resolutions between the Oriya and Telugu scripts

    10.4 Oriya and Kannada

    The following characters were discussed and the NBGP concluded that they are neither variant code points nor confusable code points.

    Oriya Kannada Resolution

    ଠ(0B20) ರ(0CB0) Distinguishable

    ଠ(0B20) ಠ(0CA0) Distinguishable

    Table 12: Other resolutions between the Oriya and Kannada scripts

    10.5 Oriya and Malayalam

    The following characters are visually confusable. The NBGP discussed and concluded that they are similar code points and should not be considered as variant code points.

    Oriya Malayalam

    ◌ଂ(0B02) ◌ം(0D02)

    ◌ଃ (0B03) ◌ഃ(0D03)

    Table 13: Confusable code points between the Oriya and Malayalam scripts
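    The confusable pairs listed in Tables 8, 9 and 13 can be held in a small lookup structure. The following is a minimal sketch only: the names `CONFUSABLE` and `confusables_in` are hypothetical and are not part of the LGR XML or of any ICANN tool. The code points themselves are transcribed from the tables above.

```python
# Oriya code point -> {other script: confusable code point},
# transcribed from Tables 8, 9 and 13 of this appendix.
# The structure and names are illustrative assumptions.
CONFUSABLE = {
    0x0B03: {"Gujarati": 0x0A83, "Malayalam": 0x0D03},  # SIGN VISARGA
    0x0B2A: {"Gujarati": 0x0A98},                       # LETTER PA
    0x0B25: {"Gujarati": 0x0AA5},                       # LETTER THA
    0x0B13: {"Bengali": 0x0993},                        # LETTER O
    0x0B02: {"Malayalam": 0x0D02},                      # SIGN ANUSVARA
}


def confusables_in(label):
    """List (oriya_char, script, other_char) for every code point in
    `label` that has a confusable listed in the tables above."""
    hits = []
    for ch in label:
        for script, other in sorted(CONFUSABLE.get(ord(ch), {}).items()):
            hits.append((ch, script, chr(other)))
    return hits


if __name__ == "__main__":
    # ORIYA LETTER O followed by ORIYA LETTER TTHA
    for oriya, script, other in confusables_in("\u0B13\u0B20"):
        print(f"U+{ord(oriya):04X} resembles {script} U+{ord(other):04X}")
```

    Such a table only flags similar code points for review. As stated above, these pairs are not treated as variant code points.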


    11 Appendix B: Oriya Dialects

    There are different ways of speaking, and different meanings of words, in local forms of the Oriya language. However, the script remains the same.⁴

    11.1.1.1 Standard Odia

    Kataki Odia, or the Odia of the Mughalbandi region, is considered Standard Odia due to its literary traditions. It is spoken mainly in the eastern half of the state of Odisha, with little variation, in districts such as Khurdha, Puri, Cuttack, Jajpur, Jagatsinghpur, Kendrapada, Dhenkanal, Angul and Nayagarh.

    11.1.1.2 Major forms, or dialects

    Midnapori Odia: Spoken in the undivided Midnapore District of West Bengal.

    Singhbhumi Odia: Spoken in East Singhbhum, West Singhbhum and Saraikela-Kharsawan districts of Jharkhand.

    Baleswari Odia: Spoken in Baleswar, Bhadrak and Mayurbhanj districts of Odisha.

    Ganjami Odia: Spoken in Ganjam and Gajapati districts of Odisha and Srikakulam district of Andhra Pradesh.

    Sambalpuri Odia: Spoken in Bargarh, Bolangir, Boudh, Debagarh, Jharsuguda, Kalahandi, Nuapada, Sambalpur and Subarnapur districts of Odisha and by some people in Raigarh, Mahasamund and Raipur districts of Chhattisgarh state.

    Desiya Odia: Spoken in Koraput, Rayagada, Nowrangpur and Malkangiri districts of Odisha and in the hilly regions of Visakhapatnam and Vizianagaram districts of Andhra Pradesh.

    Bhatri: Spoken in south-western Odisha and south-eastern Chhattisgarh.

    Halbi: Spoken in the undivided Bastar district of Chhattisgarh. Halbi is a mixture of Odia and Marathi with influence from Chhattisgarhi tribal languages.

    Phulbani Odia: Spoken in Phulbani, Phulbani Town, Khajuripada block of Kandhamal, and in nearby areas bordering Boudh district. This language gained momentum during the amalgamation of the Kandhamal (Phulbani) and Boudh regions into a single district, Phulabani.

    11.1.1.3 Minor non-literary and tribal forms or dialects

    Sundargadi Odia: Variation of Odia spoken in Sundargarh district of Odisha and in adjoining pockets of Jharkhand and Chhattisgarh.

    Kalahandia Odia: Variation of Odia spoken in the undivided Kalahandi District and neighboring districts of Chhattisgarh.

    Kurmi: Spoken in Northern Odisha and Southwest Bengal.

    Sounti: Spoken in Northern Odisha and Southwest Bengal.

    Bathudi: Spoken in Northern Odisha and Southwest Bengal.

    Kondhan: A tribal dialect spoken in Western Odisha.

    Laria: Spoken in bordering areas of Chhattisgarh and Western Odisha.

    Aghria: Spoken mostly by the indigenous people of the Aghria caste in Western Odisha.

    Bhulia: Tribal form spoken in Western Odisha.

    Sadri: A mixture of Odia and Hindi language with major regional tribal influence.

    Bodo Parja / Jharia: Tribal dialect of Odia spoken mostly in Koraput district of Southern Odisha.

    Matia: Tribal dialect of Odia spoken in Southern Odisha.

    Bhuyan: Tribal dialect of Odia spoken in Southern Odisha.

    Reli: Spoken in Southern Odisha and bordering areas of Andhra Pradesh.

    Kupia: Spoken by Valmiki caste people in the Indian states of Telangana and Andhra Pradesh, mostly in Hyderabad, Mahabubnagar, Srikakulam, Vizianagaram, East Godavari and Visakhapatnam districts.

    ⁴ Extracted from Wikipedia, https://en.wikipedia.org/wiki/Odia_language#Major_forms_or_dialects


    12 Appendix C: Oriya Characters

    The Odisha State Government Primary School Grade 1 e-book "Hasa Khela" [105], page 112, lists all the Oriya characters as shown in Figure 4.

    Figure 4: Odisha State Govt. Primary School Grade 1 e-book (Page 112)

    13 Appendix D: NBGP Cross-script Variant Inclusion Policy

    If, in any two given scripts, all the potential cross-script variants consist of dependent (e.g. Vowel Signs, Anusvara, Visarga, Chandrabindu etc.) characters ONLY, then that entire set can be ignored and no cross-script variants be proposed between those two scripts.

    If, in any two given scripts, there is AT LEAST ONE non-dependent (e.g. Consonant, Vowel etc.) cross-script variant character/sequence present, all the potential cross-script variants be considered and proposed between the two scripts.

    This cross-script analysis has been restricted to the scripts that have descended from Brahmi, as most of them share similar usage patterns. By and large, all of these scripts have a common set of characters that existed in the Brahmi script and bear the same identities.
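    The two rules of the inclusion policy above can be sketched as a small decision function. This is an illustration under stated assumptions, not the NBGP's actual tooling: the function names are hypothetical, candidates are limited to single code points (the policy also covers sequences), and "dependent" is approximated by the Unicode general categories Mn and Mc (combining marks), which is a simplification of the categories the panel used.

```python
import unicodedata


def is_dependent(ch: str) -> bool:
    """Assumption: combining marks (general categories Mn, Mc) stand in
    for 'dependent' characters such as vowel signs, Anusvara, Visarga."""
    return unicodedata.category(ch) in ("Mn", "Mc")


def variants_to_propose(candidates):
    """Apply the two policy rules to a list of (char_a, char_b) pairs of
    potential cross-script variants between two scripts.

    Rule 1: if every candidate involves dependent characters only,
            ignore the whole set (return an empty list).
    Rule 2: if at least one candidate is non-dependent, consider and
            propose all candidates.
    """
    pairs = list(candidates)
    only_dependent = all(is_dependent(a) and is_dependent(b) for a, b in pairs)
    return [] if only_dependent else pairs


if __name__ == "__main__":
    # Illustrative inputs only: two sign pairs, then the TTHA consonant pair.
    marks_only = [("\u0B02", "\u0D02"), ("\u0B03", "\u0D03")]
    with_consonant = [("\u0B20", "\u0D20")]
    print(len(variants_to_propose(marks_only)))
    print(len(variants_to_propose(with_consonant)))
```

    The candidate lists passed in are inputs chosen for illustration. Which pairs actually count as potential variants is decided by the panel's analysis, not by this function.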


    However, as the scripts branched out from Brahmi, depending on various factors, the shapes of the characters changed. This change in shape was not uniform across all the characters and scripts. The shapes of some characters changed significantly, whereas others retained their similarity. The cross-script similarity analysis also aims to identify such cases, where the same character retained almost the same shape despite being part of different scripts. Such sets of characters are variants of each other in the true sense, rather than merely being coincidentally similar in appearance.

    Case of Malayalam and Odia (Oriya) TTHA Consonant:

    This is the case of "Consonant Ttha", which happened to retain the same shape despite being part of different scripts, i.e., Malayalam and Odia. These characters are:

    ഠ - MALAYALAM LETTER TTHA (U+0D20)
    ଠ - ORIYA LETTER TTHA (U+0B20)

    Both characters look exactly alike and belong to the "Consonant" category. As they are consonants, each of them, even in the simplest form, i.e. the character by itself, is a valid label. As per the NBGP cross-script variant inclusion policy, this is a valid case for inclusion. Also, even though they are single characters, when the same character is repeated, they can theoretically form an infinite⁵ number of cross-script variant labels between the scripts involved. Here are some samples of those labels:

    Malayalam | Oriya
    ഠഠഠ (U+0D20 U+0D20 U+0D20) | ଠଠଠ (U+0B20 U+0B20 U+0B20)
    ഠഠഠഠ (U+0D20 U+0D20 U+0D20 U+0D20) | ଠଠଠଠ (U+0B20 U+0B20 U+0B20 U+0B20)
    ഠഠഠഠഠ (U+0D20 U+0D20 U+0D20 U+0D20 U+0D20) | ଠଠଠଠଠ (U+0B20 U+0B20 U+0B20 U+0B20 U+0B20)

    Since having such labels is a realistic possibility and the corresponding labels look almost exactly alike, NBGP has proposed them as blocked variants.
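    The sample labels above, and the length bound mentioned in footnote 5, can be reproduced with a short sketch. It assumes Python's built-in "punycode" codec as the encoder and simply prepends the ACE prefix, which skips the validation steps of full IDNA processing. The function names are hypothetical and the single-pair mapping covers only the TTHA case discussed here.

```python
ACE_PREFIX = "xn--"
MAX_LABEL_LENGTH = 63  # limit cited in footnote 5, ACE prefix included

# The one variant pair discussed above:
# MALAYALAM LETTER TTHA (U+0D20) -> ORIYA LETTER TTHA (U+0B20)
TO_ORIYA = str.maketrans({"\u0D20": "\u0B20"})


def oriya_variant(label: str) -> str:
    """Map a Malayalam TTHA label to its Oriya look-alike label."""
    return label.translate(TO_ORIYA)


def ace_form(label: str) -> str:
    """Simplified ACE form: prefix plus Punycode, without IDNA
    validation (an assumption made for this sketch)."""
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def longest_repeat(ch: str) -> int:
    """Largest n such that ch repeated n times still fits the limit."""
    n = 1
    while len(ace_form(ch * (n + 1))) <= MAX_LABEL_LENGTH:
        n += 1
    return n


if __name__ == "__main__":
    malayalam = "\u0D20" * 3
    oriya = oriya_variant(malayalam)
    print(ace_form(malayalam), ace_form(oriya))
    print(longest_repeat("\u0B20"))
```

    The two labels render alike but encode to different ACE strings, which is why they are treated as blocked variants of each other rather than as one label.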

    NBGP acknowledges the concern that this shape is quite generic and may have parallels in other scripts not under its ambit. However, as NBGP does not have any exposure to the actual usage of those characters in those particular scripts, NBGP desisted from including them in the analysis. As NBGP has already considered all the related scripts under the cross-script variant analysis, the similarity of the characters belonging to NBGP scripts with other scripts not under the NBGP ambit may be of a merely coincidental visual nature.

    Additionally, this concern is not limited to these two characters but applies to all the characters in all the scripts under the scope of the Root LGR procedure. Carrying out this analysis can practically be done only with the Generation Panels that exist while the NBGP is active. This still leaves out those scripts which may not have a Generation Panel established yet. Hence, carrying out this exercise in its entirety is quite impracticable. This conundrum can be resolved if all such cases are handled by the "String Similarity Assessment Panel" of ICANN.

    ⁵ Though theoretically infinite, this number would be limited to the number of such labels whose equivalent Punycode string would not exceed 63 characters including the ACE prefix "xn--".