Proposal for a Devanagari Script Root Zone Label Generation Rule-Set (LGR)
LGR Version: 3.0
Date: 2019-04-22
Document version: 6.5
Authors: Neo-Brahmi Generation Panel [NBGP]
1 General Information/Overview/Abstract
This document lays down the Label Generation Rule Set for the Devanagari script. Three main components of the Devanagari Script LGR, i.e. Code point repertoire, Variants and Whole Label Evaluation Rules, have been described in detail here. All these components have been incorporated in a machine-readable format in the accompanying XML file named "proposal-devanagari-lgr-22apr19-en.xml".
In addition, a document named "devanagari-test-labels-22apr19-en.txt" has been provided. It contains a list of valid and invalid labels as per the Whole Label Evaluation laid down in Section 7 of this document. The labels have been tagged as valid and invalid under the specific rules.¹ In addition, the file also lists the set of labels which can produce variants as laid down in Section 6 of this document.
2 Script for which the LGR is proposed
ISO 15924 Code: Deva
ISO 15924 Key N°: 315
ISO 15924 English Name: Devanagari (Nagari)
Latin transliteration of native script name: dévanâgarî
Native name of the script: देवनागरी
Maximal Starting Repertoire [MSR] version: 4
1 The categorization of invalid labels under specific rules is given as per the general understanding of the LGR Tool by the NBGP. During testing with any LGR tool, whether a particular label gets flagged under the same rule or a different one depends entirely on the internal implementation of that tool. In case of a discrepancy, only the fact that the label is invalid should be considered.
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
3 Background on Script and Principal Languages Using It
The script called Nagari or Devanagari is written from left to right. Historically, it derives from the Brahmi alphabet of the Ashokan inscriptions. Devanagari is currently used for 11 out of 22 scheduled languages of India (Boro/Bodo, Dogri, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Sanskrit, Santali and Sindhi) and around 45 other languages, especially the related Indo-Aryan languages: Bagheli, Bhili, Bhojpuri, Himachali dialects, Magahi, Newar, and Rajasthani and its dialects Marwari, Mewati, Shekhawati, Bagri, Dhundhari, Harauti and Wagdi. Closely associated with Sanskrit and Prakrit, it is an alternative script for Kashmiri (by Hindu speakers), Sindhi and Santali. It is growing in popularity among speakers of tribal languages of Arunachal Pradesh, Bihar, Chhattisgarh, Jharkhand, Madhya Pradesh and the Andaman & Nicobar Islands. The script is also used in Fiji to represent Fiji Hindi. Hindi is also a language of communication in Mauritius, Malaysia, England, Canada, South Africa and Indonesia, as well as in emigrant communities around the world. The script is also used in Nepal for writing the Nepali language. Nepali is the official language of Nepal as well as one language of the state of Sikkim in India. It is spoken by over 30 million people.
Devanagari is used by over 120 languages in India, Bangladesh, Nepal and Southeast Asia.
3.1 The Evolution of the Script
It is well known that Devanagari has evolved from the parent script Brahmi, with its earliest historical form, known as Aśokan Brahmi, traced to the 4th century BC. Brahmi was deciphered by Sir James Prinsep in 1837. The study of Brahmi and its development has shown that it has given rise to most of the scripts in India, as well as in other countries, viz. Sri Lanka, Myanmar, Cambodia, Thailand, Laos and the region of Tibet, to name a few.
The evolution of Brahmi into present-day Devanagari involved intermediate forms common to other scripts, such as Gupta and its two derivatives, Siddhaṃ and Śāradā, in the north, and Grantha and Kadamba in the south. Devanagari can be said to have developed from the Kutila script, a descendant of the Gupta script, in turn a descendant of Brahmi. The word kutila, meaning 'crooked', was used as a descriptive term to characterize the curving shapes of the script compared to the straight lines of Brahmi. This inheritance is the reason why some of the characters across the scripts that will be considered under the Neo-Brahmi GP look similar to each other, despite belonging to totally different code blocks of the Unicode Standard.
A look at the development of Devanagari from Brahmi gives an insight into how the Indic scripts have come to be diversified: the handiwork of engravers and writers who used different types of strokes led to different regional styles. The development of the script is outlined below. Figure 1 (Pictorial depiction of evolution of Devanagari) illustrates the stages in the evolution of the script.²
Period | Description
300 BCE | Mauryan: Early Brahmi form in the Asokan edicts. Some scholars believe that Brahmi itself evolved from Kharoshthi, a script written right to left.
200 CE | Kushan/Satavahana Dynasties
400 CE | Gupta Dynasty
600 CE | Yasodharman
800 CE | Origins of the present-day Nagari Script. Vardhana dynasty in the North and Pallava period in the South
900 CE | The period of the Chalukyas and Rashtrakutas
1100 CE | Continuation of the Chalukya Rule
1300 CE | Yadavas in the north and Kakatiyas in the south
1500 CE | The Vijayanagar empire
Table 1: Evolution of Devanagari
2 http://www.acharya.gen.in:8080/sanskrit/script_dev.php
Figure 1 Pictorial depiction of evolution of Devanagari
3.2 Languages considered
Devanagari is used by over 120 languages, which makes it one of the most used scripts in the world. Languages using Devanagari as their primary script belong to varying geo-political scenarios, as given below:
- designated as official (scheduled) languages of some countries
- used by communities living in urban areas
- used by communities living in rural yet accessible areas
- used by communities living in far-flung areas which are not easily connected either by roads or by communication mechanisms
Information about official (scheduled) languages of countries is easily available. Information about languages used by communities living in urban areas is also easily obtainable. Some effort was needed to cover the languages which are spoken by communities living in rural yet accessible areas. However, it was quite difficult to cover the rest of the languages, those spoken by communities living in remote tribal areas which are generally not connected by road or by communication means. Defining the scope of language coverage was hence essential to limit the scope of the work to be undertaken for the analysis of the Devanagari LGR.
The NBGP decided to employ the "Expanded Graded Intergenerational Disruption Scale" [EGIDS], which is designed to measure the status of the languages of the world in terms of endangerment or development. The EGIDS consists of 13 levels, with each higher number on the scale representing a greater level of disruption to the intergenerational transmission of the language. The NBGP decided to accommodate all the languages belonging to EGIDS Scale 1 to 4 in its analysis; these are languages which, in one form or another, are still in use. Following are the descriptions³ of those scales:
Scale | Label | Description
1 | National | The language is used in education, work, mass media, and government at the national level.
2 | Provincial | The language is used in education, work, mass media, and government within major administrative subdivisions of a nation.
3 | Wider Communication | The language is used in work and mass media without official status to transcend language differences across a region.
4 | Educational | The language is in vigorous use, with standardization and literature being sustained through a widespread system of institutionally supported education.
Languages belonging to Level 5 and higher are not in widespread usage.
Below is the tabular representation of the languages that have been considered for the Devanagari LGR.
3 https://www.ethnologue.com/about/language-status
EGIDS Scale | Languages
1 | Hindi, Nepali
2 | Konkani, Maithili, Marathi, Sindhi
3 | Bhatri, Halbi, Kinnauri, Kukna, Panchpargania, Sadri, Wagdi
4 | Bhojpuri, Chhattisgarhi, Dogri, Kashmiri, Limbu, Magahi, Sanskrit, Santali, Eastern Tamang, Avadhi, Newar, Saraiki⁴
Table 2: Languages considered under Devanagari LGR
Despite being classified under EGIDS Scale 5, the Boro language is also considered under the Devanagari LGR, as it is one of the scheduled languages of India and is widely spoken.
Apart from the above-mentioned languages, Braj, Dhundari, Mundari and Kharia have also been considered for the analysis, as the communities using them were accessible and provided their inputs.
3.2.1 Case of Sanskrit
Sanskrit is generally perceived as an archaic language used only in ancient religious texts. However, it is worth noting that there is a quite vibrant and active user community in India which practices Sanskrit on a day-to-day basis. Sanskrit is still taught in schools under various State and Central educational boards. There is increasing use of Sanskrit on social media as well. The same is reflected in the EGIDS scale, where Sanskrit is categorized in Scale 4, indicating the status of the language as "Educational".
4 Though listed in EGIDS scale 4, Saraiki is not covered by the NBGP. As per Ethnologue, the Devanagari script is no longer in use by the Saraiki community. Ref: https://www.ethnologue.com/language/skr
3.3 The structure of written Devanagari
Devanagari is an alphasyllabary, and the heart of the writing system is the akshar. It is this unit which is instinctively recognized by users of the script. To understand the notion of akshar, a brief overview of the writing system is provided in this section, and the akshar itself will be treated in depth in Section 5.4. The writing system of Devanagari could be summed up as composed of the following:
3.3.1 The Consonants
Devanagari consonants have an implicit schwa⁵ /ə/ vowel included in them. As per traditional classification, they are categorized according to their phonetic properties (especially in terms of place plus manner of articulation). There are 5 Varga groups (classes) and one non-Varga group. Each Varga, which corresponds to Stops, contains five consonants classified as per their properties. The first four consonants are classified on the basis of voicing and aspiration, and the last is the corresponding nasal.
Varga | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | क U+0915 | ख U+0916 | ग U+0917 | घ U+0918 | ङ U+0919
Palatal | च U+091A | छ U+091B | ज U+091C | झ U+091D | ञ U+091E
Retroflex | ट U+091F | ठ U+0920 | ड U+0921 | ढ U+0922 | ण U+0923
Dental | त U+0924 | थ U+0925 | द U+0926 | ध U+0927 | न U+0928
Bi-labial | प U+092A | फ U+092B | ब U+092C | भ U+092D | म U+092E
Table 3: Varga classification of consonants
Non-Varga | य U+092F | र U+0930 | ल U+0932 | ळ U+0933 | व U+0935 | श U+0936 | ष U+0937 | स U+0938 | ह U+0939
Table 4: Non-Varga consonants
5 Although representing the implicit vowel as /a/ is more correct orthographically, the schwa /ə/, although not part of the orthographic system, has been used, since /a/ would be misunderstood and read as अ/आ/ा.
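The Varga layout described above maps directly onto the code point order of Table 3: each Varga occupies five consecutive code points in the column order unvoiced/unaspirated, unvoiced/aspirated, voiced/unaspirated, voiced/aspirated, nasal. The following is a minimal sketch of that observation; the names `VARGA_START`, `COLUMNS` and `classify_consonant` are illustrative and not part of the LGR.

```python
# Illustrative sketch: first code point of each Varga row, read off Table 3.
VARGA_START = {
    "Velar": 0x0915,
    "Palatal": 0x091A,
    "Retroflex": 0x091F,
    "Dental": 0x0924,
    "Bi-labial": 0x092A,
}

# Column order within a Varga, as laid out in Table 3.
COLUMNS = [
    ("unvoiced", "unaspirated"),
    ("unvoiced", "aspirated"),
    ("voiced", "unaspirated"),
    ("voiced", "aspirated"),
    ("nasal", None),
]


def classify_consonant(ch):
    """Return (varga, voicing, aspiration) for a Varga consonant, else None."""
    cp = ord(ch)
    for varga, start in VARGA_START.items():
        offset = cp - start
        if 0 <= offset < 5:
            voicing, aspiration = COLUMNS[offset]
            return varga, voicing, aspiration
    return None  # non-Varga consonant, or not a Table 3 character at all


for ch in "\u0915\u0918\u091E\u092F":  # क, घ, ञ, य
    print(f"U+{ord(ch):04X}", classify_consonant(ch))
```

Characters outside Table 3, including the non-Varga consonants of Table 4, yield `None`.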
3.3.2 The Implicit Vowel Killer: Halant⁶
All consonants contain an implicit vowel (schwa). A special sign is needed to denote that this implicit vowel is stripped off. This is known as the Halant ् (U+094D). The Halant thus joins two consonants and creates conjuncts, which are generally combinations of 2 to 4 consonants. In rare cases it can join up to 5 consonants. However, the notion of a maximum number of consonants joining to form one akshar is empirical. It is just an observation drawn from the words that have been observed to date. Given the confluence of languages happening in the Internet age, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. Hence, in the LGR work, this limit will not be enforced.⁷
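As a rough illustration of how the Halant chains consonants into a conjunct, the sketch below joins consonants with U+094D and counts them. The helper names `make_conjunct` and `consonant_count` are hypothetical, and the counting helper assumes its input contains only consonants and Halants. In line with the text, no upper bound on the number of consonants is enforced.

```python
HALANT = "\u094D"  # DEVANAGARI SIGN VIRAMA


def make_conjunct(*consonants):
    """Join consonants with Halant; only the last keeps its implicit vowel."""
    return HALANT.join(consonants)


def consonant_count(cluster):
    """Count consonants in a cluster made only of consonants and Halants."""
    return len(cluster.split(HALANT))


ksha = make_conjunct("\u0915", "\u0937")  # क + ् + ष
five = make_conjunct("\u0930", "\u0924", "\u0938", "\u0928", "\u092F")  # र त स न य

print(" ".join(f"U+{ord(c):04X}" for c in ksha))
print(consonant_count(ksha), consonant_count(five))
```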
6 Unicode (cf. Unicode 3.0 and above) prefers the term Virama. In this report, both terms have been used to denote the character that suppresses the inherent vowel.
7 This can be the case when a foreign language word which admits a large number of consonants is transliterated into Devanāgarī.
3.3.3 Vowels
Separate symbols exist for all Vowels which are pronounced independently, either at the beginning or after a vowel sound. To indicate a Vowel sound other than the implicit one, a Vowel sign (Matra) is attached to the consonant. Since the consonant has a built-in schwa, there are equivalent Matras for all vowels except अ.
The correlation is shown as follows:
Vowel | Corresponding vowel sign (Matra)
अ U+0905 | (none)
आ U+0906 | ा U+093E
इ U+0907 | ि U+093F
ई U+0908 | ी U+0940
उ U+0909 | ु U+0941
ऊ U+090A | ू U+0942
ऋ U+090B | ृ U+0943
ए U+090F | े U+0947
ऐ U+0910 | ै U+0948
ओ U+0913 | ो U+094B
औ U+0914 | ौ U+094C
ॳ U+0973 | ऺ U+093A
ॴ U+0974 | ऻ U+093B
ऎ U+090E, ऄ U+0904 | ॆ U+0946
ऒ U+0912 | ॊ U+094A
ऍ U+090D, ॲ U+0972 | ॅ U+0945
ॠ U+0960 | ॄ U+0944
ऑ U+0911 | ॉ U+0949
ॵ U+0975 | ॏ U+094F
ॶ U+0976 | ॖ U+0956
ॷ U+0977 | ॗ U+0957
Table 5: Vowels with corresponding Matras
Marathi uses ॲ (U+0972) instead of ऍ (U+090D).
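The vowel-to-Matra correlation of Table 5 can be captured as a simple lookup. The sketch below is illustrative only: `MATRA_FOR`, `matra_for` and `syllable` are hypothetical names, and the mapping contains exactly the pairs listed in Table 5, with अ (U+0905) deliberately absent because it has no Matra.

```python
# Independent vowel -> Matra, transcribed from Table 5 (code points in hex).
MATRA_FOR = {
    0x0906: 0x093E, 0x0907: 0x093F, 0x0908: 0x0940, 0x0909: 0x0941,
    0x090A: 0x0942, 0x090B: 0x0943, 0x090F: 0x0947, 0x0910: 0x0948,
    0x0913: 0x094B, 0x0914: 0x094C, 0x0973: 0x093A, 0x0974: 0x093B,
    0x090E: 0x0946, 0x0904: 0x0946, 0x0912: 0x094A, 0x090D: 0x0945,
    0x0972: 0x0945, 0x0960: 0x0944, 0x0911: 0x0949, 0x0975: 0x094F,
    0x0976: 0x0956, 0x0977: 0x0957,
}

LETTER_A = "\u0905"  # अ: the implicit vowel, no Matra


def matra_for(vowel):
    """Return the Matra for an independent vowel, or None if Table 5 has none."""
    cp = MATRA_FOR.get(ord(vowel))
    return chr(cp) if cp is not None else None


def syllable(consonant, vowel):
    """Consonant plus vowel sound: bare consonant for अ, else consonant + Matra."""
    if vowel == LETTER_A:
        return consonant
    matra = matra_for(vowel)
    if matra is None:
        raise ValueError(f"no Matra listed for U+{ord(vowel):04X}")
    return consonant + matra


print(syllable("\u0915", "\u0908"))  # क + ई
```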
3.3.4 The Anusvara (ं - U+0902)
The Anusvara represents a homorganic nasal. It replaces a conjunct group of a Nasal Consonant + Halant + Consonant belonging to that particular varga. Before a non-varga consonant, the Anusvara represents a nasal sound. Modern Hindi, Marathi and Konkani languages prefer the Anusvara to the corresponding Half-nasal.⁸
सन्त vs संत /sənt/ "saint": U+0938 U+0928 U+094D U+0924 vs U+0938 U+0902 U+0924
चम्पा vs चंपा /tʃəmpa/ "a flower belonging to the genus Plumeria": U+091A U+092E U+094D U+092A U+093E vs U+091A U+0902 U+092A U+093E
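The substitution described above (Nasal Consonant + Halant before a consonant of the same Varga becomes Anusvara) can be sketched as a small rewrite. This is an illustration, not a rule of the LGR; `to_anusvara` is a hypothetical helper, and it assumes that only the four stops of a Varga, not the nasal itself, trigger the replacement.

```python
import re

HALANT = "\u094D"
ANUSVARA = "\u0902"
# First code point of each Varga (Velar, Palatal, Retroflex, Dental, Bi-labial).
VARGA_STARTS = [0x0915, 0x091A, 0x091F, 0x0924, 0x092A]


def to_anusvara(text):
    """Replace nasal + Halant with Anusvara before a stop of the same Varga."""
    for start in VARGA_STARTS:
        nasal = chr(start + 4)                              # fifth consonant
        stops = "".join(chr(start + i) for i in range(4))   # first four
        text = re.sub(f"{nasal}{HALANT}(?=[{stops}])", ANUSVARA, text)
    return text


for word in ("\u0938\u0928\u094D\u0924", "\u091A\u092E\u094D\u092A\u093E"):
    out = to_anusvara(word)
    print(" ".join(f"U+{ord(c):04X}" for c in out))
```

Applied to the two examples from the text, the sketch yields the Anusvara spellings given there.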
3.3.5 Nasalization: Candrabindu (ँ - U+0901)
Candrabindu denotes nasalization of the preceding vowel, as in आँख /ãkh/ "eye" (U+0906 U+0901 U+0916). Present-day Hindi users tend to replace the Candrabindu by the Anusvara.
3.3.6 Nukta (़ - U+093C)⁹
The Nukta sign is placed below a certain number of consonants to represent sounds found only in words borrowed from Perso-Arabic. It is predominantly used in this manner in Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi and Tamang. It can be adjoined to क (U+0915), ख (U+0916), ग (U+0917), ज (U+091C) and फ (U+092B) to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style, e.g.
फ़िरोज़ /firoz/ (U+092B U+093C U+093F U+0930 U+094B U+091C U+093C)
It is also placed under ड (U+0921) and ढ (U+0922) to indicate flapped sounds, e.g.
बढ़ /bədh/ (U+092C U+0922 U+093C)
The web publication DEVANĀGARĪ ALPHABET AND ITS ROMANIZATION [109] by the Central Hindi Directorate, Ministry of HRD, Government of India, clearly states such a use of the Nukta in Hindi.
In Bodo, the Nukta is adjoined to ड (U+0921) [110]. In Maithili, it is adjoined to क (U+0915), ज (U+091C), ड (U+0921) and ढ (U+0922) [111]. In Sindhi, it is adjoined to ख (U+0916), ग (U+0917), ज (U+091C), फ (U+092B), ड (U+0921) and ढ (U+0922) [104].
In Kashmiri, it can also be adjoined to च (U+091A), छ (U+091B) and ज (U+091C) [108] to indicate the laterally released affricates:
च़ाय /cay/ "tea" (U+091A U+093C U+093E U+092F)
छ़ल /chal/ "wash" (imperative) (U+091B U+093C U+0932)
पॊज़ /póz/ "fact" (U+092A U+094A U+091C U+093C)
Normally, a Nukta is appended to a Consonant. However, the Santali language uses the Nukta in a unique way. The Nukta is adjoined to the following vowels and vowel signs:
a. आ (U+0906)
b. ओ (U+0913)
c. ा (U+093E)
d. ो (U+094B)
8 A half-nasal is used in epigraphy to indicate a nasal consonant conjoined to its corresponding "Varga" through a Halant.
9 The possible sets of consonants/vowels have been derived from various sources, viz. prior research carried out by the Centre for Development of Advanced Computing's [C-DAC] Graphics Intelligence based Script Technologies [GIST] Research Labs (https://cdac.in/index.aspx?id=mlc_gist_about), Omniglot, and inputs provided by various experts on board the NBGP for specific languages. Only Omniglot references have been provided, as they are available online.
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)
The Visarga is frequently used in Sanskrit and represents a sound very close to /h/, for example दुःख /duhkh/ "sorrow, unhappiness" (U+0926 U+0941 U+0903 U+0916).
The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages using Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire, as it is barred in the Maximal Starting Repertoire.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Halant) where default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष /ksha/, which gets formed by क /ka/ + ् (halant) + ष /sha/, gets rendered as क्‌ष when formed by क /ka/ + ् (halant) + Zero Width Non-joiner + ष /sha/. In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.
The Zero Width Joiner (ZWJ) is another invisible character, used in certain cases (mostly after Halant) in which a particular conjunct combination gets rendered such that the constituent consonant shapes may not be directly visible in the conjunct shape. For example, the conjunct क्ष /ksha/, which gets formed by क /ka/ + ् (halant) + ष /sha/, does not show the half form of ka joining with sha. However, using ZWJ, the constituent consonants' shapes are preserved in the visual depiction क्‍ष, formed by क /ka/ + ् (halant) + Zero Width Joiner + ष /sha/.
Earlier, the ZWJ was recommended by the Unicode Consortium to be used to generate certain special conjuncts like Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of ZWJ is now not encouraged.
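Because ZWNJ and ZWJ are invisible, the three forms of the ka + sha combination discussed above are easiest to tell apart by listing their code points. The sketch below does only that; `code_points`, `contains_joiner` and the labels in `SEQUENCES` are illustrative names, and how each sequence is actually drawn depends on the font and rendering engine.

```python
KA, SSA = "\u0915", "\u0937"
HALANT, ZWNJ, ZWJ = "\u094D", "\u200C", "\u200D"


def code_points(s):
    """Spell out a string as U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


def contains_joiner(label):
    """True if the label uses ZWNJ or ZWJ."""
    return ZWNJ in label or ZWJ in label


SEQUENCES = {
    "default conjunct": KA + HALANT + SSA,
    "explicit Halant (ZWNJ)": KA + HALANT + ZWNJ + SSA,
    "half form (ZWJ)": KA + HALANT + ZWJ + SSA,
}

for name, seq in SEQUENCES.items():
    print(f"{name}: {code_points(seq)} joiner={contains_joiner(seq)}")
```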
4 Overall Development Process and Methodology
Under the Neo-Brahmi Generation Panel, there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP ensured that the fundamental philosophy behind building those LGRs is in sync across all Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code point repertoire, across the board, for all the scripts within its ambit.
4.1.1 Inclusion principles
4.1.1.1 Modern usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code point repertoire.
4.1.1.2 Unambiguous use
Every character proposed should have an unambiguous understanding among the linguistic community about its usage in the language.
4.1.2 Exclusion principles
The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are prerequisites to the Label Generation Rule sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope
The code point repertoire for the root zone being a very special case up the ladder in the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Standard
Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.
For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ़ (U+093C), can be used instead.
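One way to see the relationship between U+0958 and its decomposed form is Unicode normalization: under NFC, U+0958 is rewritten as U+0915 U+093C, while the decomposed sequence is left unchanged. The sketch below uses Python's standard `unicodedata` module; the helper `nfc_code_points` is an illustrative name, and the snippet shows the normalization behaviour only, not the full IDNA validity check.

```python
import unicodedata


def nfc_code_points(s):
    """Return the NFC form of s as a list of U+XXXX strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", s)]


QA_PRECOMPOSED = "\u0958"        # DEVANAGARI LETTER QA
QA_DECOMPOSED = "\u0915\u093C"   # LETTER KA + SIGN NUKTA

print(nfc_code_points(QA_PRECOMPOSED))
print(nfc_code_points(QA_DECOMPOSED))
```

Both calls print the same two code points, U+0915 followed by U+093C.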
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These characters are required in certain cases where an atypical visual shape of an akshar is desired.
In domain names, due to the absence of a space (" ") or hyphen ("-"), the inability to use ZWNJ can pose issues when two words need to be joined together, the first word ends in an Explicit Halant, and the next word begins with a consonant. In that case, a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if the two words देश् (deš, "nation") and विदेश (videš, "foreign land") are juxtaposed, the resultant word, i.e. "देश्विदेश"¹⁰, is not the appropriate way of rendering it. The appropriate rendering would be "देश्‌विदेश", which can be achieved by adding a ZWNJ in between the two words.
As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the NBGP may then consider adding it to the Devanagari repertoire if necessary.
However, the exclusion of the ZWJ from the MSR may not have much impact, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes¹¹ may not be possible to produce; however, there will not be any impact on the phonetic level.
iii. Maximal Starting Repertoire
The root zone LGR being a repertoire of the characters which are going to be used for the creation of root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].
To sum up, the restrictions start off by admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
10 In this particular case, it is possible to get the required display by dropping the explicit Halant at the end of the word; however, in that case one can argue that the pronunciation of the two words, i.e. देश् and देश, is different and hence it changes the fundamental word.
11 Case of क्ष and क्‍ष: the first is composed with क + ् + ष, while the latter is with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
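The three successive filters summarized above (Unicode block, then IDNA, then MSR) can be pictured as a chain of membership checks. The sketch below is illustrative only: the exclusion sets contain just the examples named in the text (U+0958 for IDNA, U+093D for the MSR) and are in no way the real IDNA or MSR tables, and `first_failing_filter` is a hypothetical helper.

```python
# Devanagari block of the Unicode Standard: U+0900..U+097F.
DEVANAGARI_BLOCK = range(0x0900, 0x0980)

# Illustrative subsets only, taken from the examples given in the text.
IDNA_EXCLUDED = {0x0958}  # DEVANAGARI LETTER QA
MSR_EXCLUDED = {0x093D}   # DEVANAGARI SIGN AVAGRAHA


def first_failing_filter(cp):
    """Name the first filter that rejects cp, or None if it passes all three."""
    if cp not in DEVANAGARI_BLOCK:
        return "Unicode block"
    if cp in IDNA_EXCLUDED:
        return "IDNA"
    if cp in MSR_EXCLUDED:
        return "MSR"
    return None


for cp in (0x0915, 0x0958, 0x093D, 0x200C):
    print(f"U+{cp:04X}", first_failing_filter(cp))
```

Each later filter only ever sees code points that passed the earlier ones, which mirrors the narrowing described in the text.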
4.1.2.2 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.
4.1.2.3 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters, like Isshar (U+09FA), Abbreviation sign (U+0970), etc., will not be included.
4.1.2.4 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ॄ (U+0944) and ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.
4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through Public Comment process
The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were of an editorial nature. Some comments called for attention to the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided, for each individual comment received, whether it needed a change in the overall LGR recommendation.
Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2: Devanagari Code Page from [MSR]
Color convention:¹²
- All characters that are included in the [MSR]: Yellow background
- PVALID in IDNA 2008 but excluded from the MSR for various reasons: Pinkish background
- Not PVALID in IDNA 2008: White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references for Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
SrNo
UnicodeCodePoint
Glyph CharacterName Category
Examplelanguagesusingthecodepoint(Notexhaustive
list)
LanguagewithlowestEGIDSscaleusingthecodepoint
Reference
1 0901 DEVANAGARISIGNCANDRABINDU Candrabindu
BodoHindiKashmiriKonkaniMaithiliMarathiNepaliSantaliand
Sanskrit
1HindiNepali
[0][101][102][103][105][108][110][111][112]
[113]
2 0902 DEVANAGARISIGNANUSVARA
Anusvara(Bindu)
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
3 0903 ः DEVANAGARISIGNVISARGA Visarga
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
4 0905 अ DEVANAGARILETTERA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
5 0906 आ DEVANAGARILETTERAA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
6 0907 इ DEVANAGARILETTERI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
7 0908 ई DEVANAGARILETTERII Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
19
8 0909 उ DEVANAGARILETTERU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
9 090A ऊ DEVANAGARILETTERUU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
10 090B ऋ DEVANAGARI
LETTERVOCALICR
Vowel HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
11 090D ऍ DEVANAGARILETTERCANDRAE Vowel Hindi 1Hindi [0][101]
12 090E ऎ DEVANAGARILETTERSHORTE Vowel Kashmiri 4Kashmiri [0][105][108]
13 090F ए DEVANAGARILETTERE Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
14 0910 ऐ DEVANAGARILETTERAI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
15 0911 ऑ DEVANAGARI
LETTERCANDRAO
Vowel HindiKonkaniMarathiKashmiri 1Hindi
[0][100][101][102][108]
[112]
16 0912 ऒ DEVANAGARILETTERSHORTO Vowel Kashmiri 4Kashmiri [0][105][108]
17 0913 ओ DEVANAGARILETTERO Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
18 0914 औ DEVANAGARILETTERAU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
19 0915 क DEVANAGARILETTERKA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
20
20 0916 ख DEVANAGARILETTERKHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
21 0917 ग DEVANAGARILETTERGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
22 0918 घ DEVANAGARILETTERGHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
23 0919 ङ DEVANAGARILETTERNGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
24 091A च DEVANAGARILETTERCA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
25 091B छ DEVANAGARILETTERCHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
26 091C ज DEVANAGARILETTERJA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
27 091D झ DEVANAGARILETTERJHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
28 091E ञ DEVANAGARILETTERNYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
29 091F ट DEVANAGARILETTERTTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
21
30 0920 ठ DEVANAGARILETTERTTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
31 0921 ड DEVANAGARILETTERDDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
32 0922 ढ DEVANAGARILETTERDDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
33 0923 ण DEVANAGARILETTERNNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
34 0924 त DEVANAGARILETTERTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
35 0925 थ DEVANAGARILETTERTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
36 0926 द DEVANAGARILETTERDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
37 0927 ध DEVANAGARILETTERDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
38 0928 न DEVANAGARILETTERNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
39 092A प DEVANAGARILETTERPA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
22
40 092B फ DEVANAGARILETTERPHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
41 092C ब DEVANAGARILETTERBA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
42 092D भ DEVANAGARILETTERBHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
43 092E म DEVANAGARILETTERMA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
44 092F य DEVANAGARILETTERYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
45 0930 र DEVANAGARILETTERRA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
46 0932 ल DEVANAGARILETTERLA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
47 0933 ळ DEVANAGARILETTERLLA Consonant
BodoKonkaniMarathiNepali
Sanskrit1Nepali
[0][102][103][110][112]
[113]
48 0935 व DEVANAGARILETTERVA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
49 0936 श DEVANAGARILETTERSHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
23
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
53 | 093A | ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
55 | 093C | ़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 Hindi | [0][101][105][108][110][109][111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
59 | 0941 | ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
60 | 0942 | ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
61 | 0943 | ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E = candra | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0][100][101][108]
63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant/Virama | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9][100][102][108][112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of DEVANAGARI LETTER RRA in the repertoire, so that the "Eyelash Reph" construct (see footnote 13) can be included.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by normal Ra, U+0930), the term "Reph" is used here.
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7 Sequences
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of forming their words, known as akshar. The next section gives detailed akshar formation rules as applicable to the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodating additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first subsection lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second subsection lists four operators, along with their functions, which are assumed while specifying the rules. The following two subsections describe the two major categories of akshar formations, the first of which begins with vowels and the second with consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol   Function
|        Alternative
[ ]      Optional
*        Variable Repetition
( )      Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may optionally be followed by an Anusvara (B), Candrabindu (D) or Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ a | U+0905
Vowel + Anusvara | V[B] | अं aṁ | अ + ं (U+0905 U+0902)
Vowel + Candrabindu | V[D] | अँ aṃ | अ + ँ (U+0905 U+0901)
Vowel + Visarga | V[X] | अः aḥ | अ + ः (U+0905 U+0903)
Table 9
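The pattern V[B|D|X] can be checked mechanically. The following is a minimal Python sketch and not part of the LGR itself: the name is_vowel_sequence is hypothetical, and the range U+0905 to U+0914 is an assumed stand-in for category V that is broader than the vowels actually used for Hindi.

```python
import re

# Assumed stand-ins for the variables of Section 5.5.1 (illustration only).
V = "\u0905-\u0914"  # independent vowels; broader than the Hindi vowel set
B = "\u0902"         # Anusvara
D = "\u0901"         # Candrabindu
X = "\u0903"         # Visarga

# One vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(f"[{V}][{B}{D}{X}]?")


def is_vowel_sequence(text):
    """Return True if text is exactly one V[B|D|X] sequence."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Under these assumptions, U+0905 U+0902 is accepted, while a vowel followed by both an Anusvara and a Visarga is rejected.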
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may optionally be followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. The consonant sequence can be extended further after the N, M and H. Each of these is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क ka | U+0915 (single character)
Consonant + Nukta | C[N] | क़ ḳa | क + ़ (U+0915 U+093C)
Table 10
2. A consonant optionally followed by a dependent vowel sign (Matra) [M], or Anusvara [B], Candrabindu [D], Visarga [X] or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि ki | क + ि (U+0915 U+093F)
Consonant + Anusvara | C[B] | कं kaṁ | क + ं (U+0915 U+0902)
Consonant + Candrabindu | C[D] | कँ kaṃ | क + ँ (U+0915 U+0901)
Consonant + Visarga | C[X] | कः kaḥ | क + ः (U+0915 U+0903)
Consonant + Halant | C[H] | क् k (pure consonant) | क + ् (U+0915 U+094D)
Table 11
2A. A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं kīṁ | क + ी + ं (U+0915 U+0940 U+0902)
Consonant + Matra + Candrabindu | CM[D] | काँ kāṃ | क + ा + ँ (U+0915 U+093E U+0901)
Consonant + Matra + Visarga | CM[X] | कीः kīḥ | क + ी + ः (U+0915 U+0940 U+0903)
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14): *3(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य nkrya | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets
3A. The combination may be followed by M, B, D or X.
Example:
14 In the case of Sanskrit, it can join up to 5 consonants.
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की kkī | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं kkaṁ | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ kkaṃ | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः kkaḥ | U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं kkīṁ | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ kkīṃ | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः kkīḥ | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
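The consonant-sequence rules 1 to 3B can be folded into one pattern over the category letters. The sketch below is an illustration only: the helper names are hypothetical, the code point ranges used for C and M are simplifying assumptions rather than the repertoire of Table 6, and the way the rules are combined into a single expression (including a final Halant after a conjunct) is one possible interpretation.

```python
import re

# Assumed, simplified category assignment; the authoritative one is Table 6.
SINGLE_CATEGORIES = {0x0901: "D", 0x0902: "B", 0x0903: "X", 0x093C: "N", 0x094D: "H"}


def category(ch):
    """Map one character to a variable name of Section 5.5.1 ('?' if unknown)."""
    cp = ord(ch)
    if cp in SINGLE_CATEGORIES:
        return SINGLE_CATEGORIES[cp]
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x093E <= cp <= 0x094C:
        return "M"
    return "?"


# Up to three (C[N]H) groups, a final C[N], then either H or [M][B|D|X].
CONSONANT_SEQUENCE = re.compile(r"(?:CN?H){0,3}CN?(?:H|M?[BDX]?)")


def is_consonant_sequence(text):
    """Return True if text forms one Hindi consonant sequence under this sketch."""
    categories = "".join(category(ch) for ch in text)
    return CONSONANT_SEQUENCE.fullmatch(categories) is not None
```

Matching on the category string rather than on raw code points keeps the expression close to the notation used in the rules, so a change to the category table does not require rewriting the pattern.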
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters
and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
6 Variants
There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning, which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated that the Whole Label Evaluation rules (given in Section 7) be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed.
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2, e.g. in the first pair आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
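The effect of the "not followed by a Nukta" condition can be illustrated with a short Python sketch. The function names are hypothetical and the code only models the Variant 1 to Variant 2 direction of Table 16; it is not how an LGR tool is required to implement the rule.

```python
NUKTA = "\u093C"

# Variant 1 code points from Table 16.
NUKTA_VARIANT_SOURCES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variant_positions(label):
    """Indexes at which the Variant 1 -> Variant 2 mapping is defined."""
    positions = []
    for i, ch in enumerate(label):
        followed_by_nukta = label[i + 1:i + 2] == NUKTA
        if ch in NUKTA_VARIANT_SOURCES and not followed_by_nukta:
            positions.append(i)
    return positions


def apply_nukta_variant(label, index):
    """Insert a Nukta after label[index] if the mapping is defined there."""
    if index not in nukta_variant_positions(label):
        raise ValueError("variant mapping is not defined at this position")
    return label[:index + 1] + NUKTA + label[index + 1:]
```

Once the mapping has been applied at a position, that position no longer satisfies the condition, so the substitution cannot regenerate and a Nukta-Nukta sequence is never produced.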
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric
definition of variants, it is necessary to define both the forward and the reverse variant mappings using the same condition (see also [RFC8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D, each of which contains 0906 or 093E:
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When the variant from the overlapping sets is substituted for 0906 or 093E respectively in the sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of the Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at
the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D: 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another variant set that overlaps with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
ऺ U+093A | ं U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
ॆ U+0946 | े U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, many users tend to ignore the phonetic effect of its presence/absence when it comes at the end. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which when succeeded by an Anusvara (U+0902) create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual variant pairs are as follows:
Variant 1 | Variant 2
ँ U+0901 | ॅं U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both the pairs preceded by the Devanagari Letter Ka (क - U+0915) are shown here:
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered in a form where the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity, and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
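The context condition of this rule can be sketched in Python as follows. The function names are hypothetical, the consonant range U+0915 to U+0939 is a simplifying assumption, and only the first pair of Table 18 is modelled; the other pairs (for example U+093E U+0901 with U+0949 U+0902) are outside this sketch.

```python
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093C"


def is_consonant(ch):
    """Assumed consonant test; the authoritative category list is Table 6."""
    return 0x0915 <= ord(ch) <= 0x0939


def preceded_by_c_or_cn(label, index):
    """True if label[index] follows a consonant, or a consonant plus Nukta."""
    if index >= 1 and is_consonant(label[index - 1]):
        return True
    return (
        index >= 2
        and label[index - 1] == NUKTA
        and is_consonant(label[index - 2])
    )


def candrabindu_variant(label):
    """Replace each Candrabindu for which the variant mapping is defined."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and preceded_by_c_or_cn(label, i):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu that follows a Vowel or a Matra is left untouched, because the Matra U+0945 could not validly appear in that position.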
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be treated as blocked variants.
There is no preference among these variants. Whichever label containing either of these variants is chosen first, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide a set of cross-script variants that it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are the variants proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed as cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are considered to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं U+0902 | ਂ U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
ऺ U+093A | ਂ U+0A02
़ U+093C | ਼ U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
ॅ U+0945 | ੱ U+0A71
ॆ U+0946 | ੇ U+0A47
ॆ U+0946 | ੋ U+0A4B
े U+0947 | ੇ U+0A47
े U+0947 | ੋ U+0A4B
ै U+0948 | ੈ U+0A48
ॖ U+0956 | ੁ U+0A41
ॗ U+0957 | ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (see footnote 15)
3. M must be preceded by C or CN (see footnote 16)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
15 where CN is a C followed by an N
16 where CN is a C followed by an N
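Rules 1 to 7 are all "preceded by" conditions, so they can be checked in one left-to-right pass. The Python sketch below is illustrative only: the function names are hypothetical, the category ranges are rough assumptions rather than the exact repertoire of Table 6, and neither the Eyelash Reph sequences nor the variant-only additional rules are handled.

```python
# Sets named in rule 1.
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}

SINGLE = {0x0901: "D", 0x0902: "B", 0x0903: "X", 0x093C: "N", 0x094D: "H"}


def category(ch):
    """Assumed category lookup; ranges are approximations of Table 6."""
    cp = ord(ch)
    if cp in SINGLE:
        return SINGLE[cp]
    if 0x0905 <= cp <= 0x0914 or 0x0972 <= cp <= 0x0977:
        return "V"
    if 0x0915 <= cp <= 0x0939 or 0x097B <= cp <= 0x097F:
        return "C"
    if cp in (0x093A, 0x093B, 0x094F, 0x0956, 0x0957) or 0x093E <= cp <= 0x094C:
        return "M"
    return "?"


def wle_violations(label):
    """Return (index, rule number) pairs for each violation of rules 1-7."""
    violations = []
    for i, ch in enumerate(label):
        cat = category(ch)
        prev = label[i - 1] if i > 0 else ""
        prev_cat = category(prev) if prev else ""
        prev2_cat = category(label[i - 2]) if i > 1 else ""
        after_c_or_cn = prev_cat == "C" or (prev_cat == "N" and prev2_cat == "C")
        if cat == "N":
            if prev not in C1 | V1 | M1:
                violations.append((i, 1))
        elif cat in ("H", "M"):
            if not after_c_or_cn:
                violations.append((i, 2 if cat == "H" else 3))
        elif cat in ("X", "B", "D"):
            if prev_cat not in ("V", "C", "N", "M"):
                violations.append((i, {"X": 4, "B": 5, "D": 6}[cat]))
        elif cat == "V" and prev_cat == "H":
            violations.append((i, 7))
    return violations
```

An empty result means that none of the seven rules was triggered under these assumptions; it does not by itself establish that a label is valid under the full LGR.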
Additional rules are used only for variants where a Nukta maps to a null, or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any of the valid akshar-ending characters. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of the hyphen, space or Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:

Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References

[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4 Overview and Rationale", 7 February 2019, httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, httpswwwethnologuecomaboutlanguage-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, httpstoolsietforghtmlrfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, httpstoolsietforghtmlrfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, httpscdacinindexaspxid=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, httpscdacinindexaspxid=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, httpscdacin (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11, httpwwwunicodeorgversionsUnicode110 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, httpwwwunicodeorgversionsUnicode500 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, httpwwwunicodeorgversionsUnicode510 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, httpwwwunicodeorgversionsUnicode600 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, httpswwwomniglotcomwritinghindihtm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, httpswwwomniglotcomwritingmarathihtm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, httpswwwomniglotcomwritingsanskrithtm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, httpswwwomniglotcomwritingsindhihtm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, httpswwwomniglotcomwritingkashmirihtm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), httpwwwunicodeorgversionsUnicode1000ch12pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, httpunicodeorg~emulleriwgp8utcdochtml (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, httpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, httpswwwomniglotcomwritingbodohtm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, httpswwwomniglotcomwritingmaithilihtm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, httpswwwomniglotcomwritingkonkanihtm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, httpswwwomniglotcomwritingnepalihtm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, httpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted

Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.

10.1 WRITING SYSTEMS

1 Diringer, D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchinson, London, 1968.

10.2 DEVANĀGARĪ

1 Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2 Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3 Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966].
4 Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5 Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6 Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7 Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8 Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9 Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10 Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11 Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12 Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13 McGregor, R. S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14 McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15 McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16 McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17 Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18 Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19 Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20 Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21 Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22 Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23 Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.

10.3 INDIC COMPUTING SPECIFIC

1 IS 10401: 8-bit code for information interchange, 1982
2 IS 10315: 7-bit coded character set for information interchange, 1985
3 IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987
4 ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001
5 ISO 2375: Procedure for registration of escape sequences, 2003
6 ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001
7 IDN POLICY: httpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters/sequences

Table 21 below shows characters and character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C

Table 21 Visually confusables
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.

At first they may not look exactly the same; however, in a given context (e.g. in a browser bar, as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing), they can create visual confusion.

A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 (footnote 17) | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali

Table 22 Devanagari Cross-script confusables

17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
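
The all-or-nothing rule stated before Table 22 can be sketched in code. The following is a minimal illustration only, not part of the LGR: the dictionary holds a hand-picked subset of the Table 22 rows, the function name `fully_confusable` is hypothetical, and the check works per code point, whereas the text speaks of characters/aksharas.

```python
# Hypothetical encoding of a few Table 22 rows:
# Devanagari code point -> {other script: set of confusable code points}.
CROSS_SCRIPT_CONFUSABLES = {
    0x0920: {"Gurmukhi": {0x0A28, 0x0A30}},
    0x0921: {"Gurmukhi": {0x0A21, 0x0A24}},
    0x0922: {"Gurmukhi": {0x0A22}},
    0x0924: {"Gurmukhi": {0x0A1C}},
    0x092F: {"Gurmukhi": {0x0A27}},
    0x0909: {"Bengali": {0x0993}},
    0x0918: {"Bengali": {0x0998}},
}


def fully_confusable(label: str, script: str) -> bool:
    """Return True only if every code point of `label` has a confusable in `script`."""
    if not label:
        return False
    return all(
        script in CROSS_SCRIPT_CONFUSABLES.get(ord(ch), {})
        for ch in label
    )
```

Under these assumptions, a label made only of ठ and ड is flagged against Gurmukhi, while adding a single क (which has no entry) makes the whole label non-confusable.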
3 Background on Script and Principal Languages Using It

The script called Nagari or Devanagari is written from left to right. Historically, it derives from the Brahmi alphabet of the Ashokan inscriptions. Devanagari is currently used for 11 out of 22 scheduled languages of India (Boro/Bodo, Dogri, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Sanskrit, Santali and Sindhi) and around 45 other languages, especially the related Indo-Aryan languages: Bagheli, Bhili, Bhojpuri, Himachali dialects, Magahi, Newar, and Rajasthani and its dialects (Marwari, Mewati, Shekhawati, Bagri, Dhundhari, Harauti and Wagdi). Closely associated with Sanskrit and Prakrit, it is an alternative script for Kashmiri (by Hindu speakers), Sindhi and Santali. It is growing popular in use by speakers of tribal languages of Arunachal Pradesh, Bihar, Chattisgarh, Jharkhand, Madhya Pradesh and Andaman & Nicobar Islands. The script is also used in Fiji to represent Fiji Hindi. Hindi is also a language of communication in Mauritius, Malaysia, England, Canada, South Africa and Indonesia, as well as in emigrant communities around the world. The script is also used in Nepal for writing the Nepali language. Nepali is the official language of Nepal as well as one language of the state of Sikkim in India. It is spoken by over 30 million people.

Devanagari is used by over 120 languages in India, Bangladesh, Nepal and in Southeast Asia.
3.1 The Evolution of the Script

It is well known that Devanagari has evolved from the parent script Brahmi, with its earliest historical form, known as Aśokan Brahmi, traced to the 4th century BC. Brahmi was deciphered by Sir James Prinsep in 1837. The study of Brahmi and its development has shown that it has given rise to most of the scripts in India, as well as in other countries, viz. Sri Lanka, Myanmar, Cambodia, Thailand, Laos and the region of Tibet, to name a few.

The evolution of Brahmi into present-day Devanagari involved intermediate forms common to other scripts, such as Gupta and its two derivatives, Siddaṃ and Śāradā, in the north, and Grantha and Kadamba in the south. Devanagari can be said to have developed from the Kutila script, a descendant of the Gupta script, in turn a descendant of Brahmi. The word kutila, meaning 'crooked', was used as a descriptive term to characterize the curving shapes of the script compared to the straight lines of Brahmi. This inheritance is the reason why some of the characters across the scripts that will be considered under the Neo-Brahmi GP look similar to each other, despite belonging to totally different code blocks of the Unicode Standard.

A look at the development of Devanagari from Brahmi gives an insight into how the Indic scripts have come to be diversified: the handiwork of engravers and writers who used different types of strokes led to different regional styles. The development of the script is outlined below. Figure 1 (Pictorial depiction of evolution of Devanagari) illustrates the stages in the evolution of the script (footnote 2).
Period | Description
300 BCE | Mauryan. Early Brahmi form in the Asokan edicts. Some scholars believe that Brahmi itself evolved from Kharoshthi, a script written right to left
200 CE | Kushan, Satavahana Dynasties
400 CE | Gupta Dynasty
600 CE | Yasodharman
800 CE | Origins of the present day Nagari Script. Vardhana dynasty in the North and Pallava period in the South
900 CE | The period of the Chalukyas and Rashtrakutas
1100 CE | Continuation of the Chalukya Rule
1300 CE | Yadavas in the north and Kakatiyas in the south
1500 CE | The Vijayanagar empire

Table 1 Evolution of Devanagari

2 httpwwwacharyagenin8080sanskritscript_devphp
Figure 1 Pictorial depiction of evolution of Devanagari
3.2 Languages considered

Devanagari is used by over 120 languages, which makes it one of the most used scripts in the world. Languages using Devanagari as their primary script belong to varying geo-political scenarios, as given below:

- designated as official (scheduled) languages of some countries
- used by communities living in urban areas
- used by communities living in rural yet accessible areas
- used by communities living in far-flung areas which are not easily connected either by roads or by communication mechanisms

Information about official (scheduled) languages of countries is easily available. Information about languages used by communities living in urban areas is also easily obtainable. Some effort was needed to cover the languages which are spoken by communities living in rural yet accessible areas. However, it was quite difficult to cover the rest of the languages, spoken by communities living in remote tribal areas which are generally not connected by road or by communication means. Defining the scope of language coverage was hence essential to limit the scope of the work to be undertaken for the analysis of the Devanagari LGR.
NBGP decided to employ the "Expanded Graded Intergenerational Disruption Scale" [EGIDS], which is designed to measure the status of the languages of the world in terms of endangerment or development. The EGIDS consists of 13 levels, with each higher number on the scale representing a greater level of disruption to the intergenerational transmission of the language. NBGP decided to accommodate all the languages belonging to EGIDS Scale 1 to 4 for its analysis; these are languages that are still in use in one form or another. Following are the descriptions (footnote 3) of those scales:

Scale | Label | Description
1 | National | The language is used in education, work, mass media, and government at the national level.
2 | Provincial | The language is used in education, work, mass media, and government within major administrative subdivisions of a nation.
3 | Wider Communication | The language is used in work and mass media without official status to transcend language differences across a region.
4 | Educational | The language is in vigorous use, with standardization and literature being sustained through a widespread system of institutionally supported education.

Languages belonging to Level 5 and higher are not in widespread usage.

Below is the tabular representation of the languages that have been considered for the Devanagari LGR.

3 httpswwwethnologuecomaboutlanguage-status
EGIDS Scale 1 | Hindi, Nepali
EGIDS Scale 2 | Konkani, Maithili, Marathi, Sindhi
EGIDS Scale 3 | Bhatri, Halbi, Kinnauri, Kukna, Panchpargania, Sadri, Wagdi
EGIDS Scale 4 | Bhojpuri, Chhattisgarhi, Dogri, Kashmiri, Limbu, Magahi, Sanskrit, Santali, Tamang Eastern, Avadhi, Newar, Saraiki (footnote 4)

Table 2 Languages considered under Devanagari LGR

Despite being classified under EGIDS Scale 5, the Boro language is also considered under the Devanagari LGR, as it is one of the scheduled languages of India and is widely spoken.

Apart from the above-mentioned languages, Braj, Dhundari, Mundari and Kharia have also been considered for the analysis, as the communities using them were accessible and provided their inputs.

3.2.1 Case of Sanskrit

Sanskrit is generally perceived as an archaic language used only in ancient religious texts. However, it is worth noting that there is a quite vibrant and active user community of Sanskrit in India which practices Sanskrit on a day-to-day basis. Sanskrit is still taught in schools under various State and Central educational boards. There is increasing use of Sanskrit on social media as well. The same is reflected in the EGIDS scale, where Sanskrit is categorized in Scale 4, indicating the status of the language as "Educational".

4 Though listed in EGIDS scale 4, Saraiki is not covered by the NBGP. As per Ethnologue, the Devanagari script is no longer in use by the Saraiki community. Ref: httpswwwethnologuecomlanguageskr
3.3 The structure of written Devanagari

Devanagari is an alphasyllabary, and the heart of the writing system is the akshar. It is this unit which is instinctively recognized by users of the script. To understand the notion of akshar, a brief overview of the writing system is provided in this section; the akshar itself will be treated in depth in Section 5.4. The writing system of Devanagari could be summed up as composed of the following:

3.3.1 The Consonants

Devanagari consonants have an implicit schwa (footnote 5) ə vowel included in them. As per traditional classification, they are categorized according to their phonetic properties (especially in terms of place plus manner of articulation). There are 5 Varga groups (classes) and one non-Varga group. Each Varga, which corresponds to Stops, contains five consonants classified as per their properties. The first four consonants are classified on the basis of voicing and aspiration, and the last is the corresponding nasal.
Varga | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | क U+0915 | ख U+0916 | ग U+0917 | घ U+0918 | ङ U+0919
Palatal | च U+091A | छ U+091B | ज U+091C | झ U+091D | ञ U+091E
Retroflex | ट U+091F | ठ U+0920 | ड U+0921 | ढ U+0922 | ण U+0923
Dental | त U+0924 | थ U+0925 | द U+0926 | ध U+0927 | न U+0928
Bi-labial | प U+092A | फ U+092B | ब U+092C | भ U+092D | म U+092E

Table 3 Varga classification of consonants
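
Because each Varga row of Table 3 occupies five consecutive code points, the grid can be regenerated from the first code point of each row. The sketch below is illustrative only; the names `VARGA_START`, `COLUMNS` and `varga_row` are assumptions made for this example and do not come from the LGR.

```python
# First code point of each Varga row, as listed in Table 3.
# The rows are not contiguous with each other: U+0929 lies between
# the Dental and Bi-labial rows and is not part of either.
VARGA_START = {
    "Velar": 0x0915,
    "Palatal": 0x091A,
    "Retroflex": 0x091F,
    "Dental": 0x0924,
    "Bi-labial": 0x092A,
}

# Column order of Table 3.
COLUMNS = (
    "unvoiced -asp",
    "unvoiced +asp",
    "voiced -asp",
    "voiced +asp",
    "nasal",
)


def varga_row(name: str) -> dict:
    """Map each Table 3 column to the consonant of the given Varga."""
    start = VARGA_START[name]
    return {column: chr(start + offset) for offset, column in enumerate(COLUMNS)}
```

For instance, `varga_row("Velar")["nasal"]` yields ङ (U+0919), matching the last cell of the Velar row.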
Non-Varga: य U+092F | र U+0930 | ल U+0932 | ळ U+0933 | व U+0935 | श U+0936 | ष U+0937 | स U+0938 | ह U+0939

Table 4 Non-Varga consonants

5 Although representing the implicit vowel as a is more correct orthographically, the schwa ə, although not part of the orthographic system, has been used, since the a would be misunderstood and read as अ, आ or ा.
3.3.2 The Implicit Vowel Killer: Halant (footnote 6)

All consonants contain an implicit vowel (schwa). A special sign is needed to denote that this implicit vowel is stripped off. This is known as the Halant (् U+094D). The Halant thus joins two consonants and creates conjuncts, which are generally combinations of 2 to 4 consonants. In rare cases it can join up to 5 consonants. However, the notion of a maximum number of consonants joining to form one akshar is empirical. It is just an observation drawn from the words that have been observed to date. Given the confluence of languages happening in the Internet age, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. Hence, in the LGR work, this limit will not be enforced (footnote 7).
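
As a small illustration of how the Halant chains consonants at the code point level, the sketch below builds a conjunct sequence and counts its consonants. The helper names are hypothetical, and `cluster_size` assumes its input is a pure consonant + Halant chain. In line with the text, no upper limit on the number of consonants is imposed.

```python
HALANT = "\u094D"  # DEVANAGARI SIGN VIRAMA


def make_conjunct(*consonants: str) -> str:
    """Join consonants with Halant: C1 + Halant + C2 + Halant + ..."""
    return HALANT.join(consonants)


def cluster_size(chain: str) -> int:
    """Count the consonants in a pure consonant + Halant chain."""
    return chain.count(HALANT) + 1
```

For example, `make_conjunct("\u0915", "\u0937")` produces the sequence U+0915 U+094D U+0937, the conjunct ksha discussed later in Section 3.3.8.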
3.3.3 Vowels

Separate symbols exist for all Vowels which are pronounced independently, either at the beginning or after a vowel sound. To indicate a Vowel sound other than the implicit one, a Vowel sign (Matra) is attached to the consonant. Since the consonant has a built-in schwa, there are equivalent Matras for all vowels excepting the अ.

The correlation is shown as follows:
Vowel | Corresponding vowel sign (Matra)
अ U+0905 | (none)
आ U+0906 | ा U+093E
इ U+0907 | ि U+093F
ई U+0908 | ी U+0940
उ U+0909 | ु U+0941
ऊ U+090A | ू U+0942
ऋ U+090B | ृ U+0943
ए U+090F | े U+0947
ऐ U+0910 | ै U+0948
ओ U+0913 | ो U+094B
औ U+0914 | ौ U+094C
ॳ U+0973 | ऺ U+093A
ॴ U+0974 | ऻ U+093B
ऎ U+090E, ऄ U+0904 | ॆ U+0946
ऒ U+0912 | ॊ U+094A
ऍ U+090D, ॲ U+0972 | ॅ U+0945
ॠ U+0960 | ॄ U+0944
ऑ U+0911 | ॉ U+0949
ॵ U+0975 | ॏ U+094F
ॶ U+0976 | ॖ U+0956
ॷ U+0977 | ॗ U+0957

Table 5 Vowels with corresponding Matras

Marathi uses ॲ (U+0972) instead of ऍ (U+090D).

6 Unicode (cf. Unicode 3.0 and above) prefers the term Virama. In this report both terms have been used to denote the character that suppresses the inherent vowel.
7 This can be the case when a foreign language word which admits a large number of consonants is transliterated into Devanāgarī.
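
The vowel-to-Matra correlation of Table 5 lends itself to a simple lookup. The sketch below covers only a subset of the rows; the dictionary name and the `syllable` helper are assumptions made for this illustration. The vowel अ maps to an empty string because the consonant already carries the inherent schwa.

```python
# Subset of Table 5: independent vowel -> dependent vowel sign (Matra).
VOWEL_TO_MATRA = {
    "\u0905": "",        # अ: inherent vowel, no Matra
    "\u0906": "\u093E",  # आ
    "\u0907": "\u093F",  # इ
    "\u0908": "\u0940",  # ई
    "\u0909": "\u0941",  # उ
    "\u090A": "\u0942",  # ऊ
    "\u090F": "\u0947",  # ए
    "\u0910": "\u0948",  # ऐ
    "\u0913": "\u094B",  # ओ
    "\u0914": "\u094C",  # औ
}


def syllable(consonant: str, vowel: str) -> str:
    """Attach the Matra of `vowel` to `consonant`; अ leaves the consonant bare."""
    return consonant + VOWEL_TO_MATRA[vowel]
```

With this mapping, `syllable("\u0915", "\u0906")` gives the sequence U+0915 U+093E (का), while `syllable("\u0915", "\u0905")` returns क unchanged.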
3.3.4 The Anusvara (ं - U+0902)

The Anusvara represents a homorganic nasal. It replaces a conjunct group of a Nasal Consonant + Halant + Consonant belonging to that particular varga. Before a non-varga consonant, the Anusvara represents a nasal sound. Modern Hindi, Marathi and Konkani languages prefer the Anusvara to the corresponding Half-nasal (footnote 8).

सन्त vs संत sənt 'saint': U+0938 U+0928 U+094D U+0924 vs U+0938 U+0902 U+0924
चम्पा vs चंपा tʃəmpa 'a flower belonging to the genus Plumeria family': U+091A U+092E U+094D U+092A U+093E vs U+091A U+0902 U+092A U+093E
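
The substitution described above (Nasal Consonant + Halant + Consonant of the same varga becomes Anusvara + Consonant) can be sketched as follows. This is a literal, simplified reading of the rule and not a spelling normalizer: the function names are hypothetical, the varga ranges are taken from Table 3, and language-specific exceptions are not considered.

```python
HALANT = "\u094D"    # DEVANAGARI SIGN VIRAMA
ANUSVARA = "\u0902"  # DEVANAGARI SIGN ANUSVARA

# First code point of each five-consonant Varga row (Table 3).
VARGA_STARTS = (0x0915, 0x091A, 0x091F, 0x0924, 0x092A)


def varga_of(ch: str):
    """Return the index of the Varga containing `ch`, or None."""
    cp = ord(ch)
    for index, start in enumerate(VARGA_STARTS):
        if start <= cp < start + 5:
            return index
    return None


def is_varga_nasal(ch: str) -> bool:
    """True if `ch` is the fifth (nasal) consonant of its Varga."""
    index = varga_of(ch)
    return index is not None and ord(ch) == VARGA_STARTS[index] + 4


def prefer_anusvara(word: str) -> str:
    """Replace nasal + Halant before a same-Varga consonant with Anusvara."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        if (
            is_varga_nasal(ch)
            and i + 2 < len(word)
            and word[i + 1] == HALANT
            and varga_of(word[i + 2]) == varga_of(ch)
        ):
            out.append(ANUSVARA)
            i += 2  # skip the nasal and the Halant
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Applied to the two examples above, the function turns U+0938 U+0928 U+094D U+0924 into U+0938 U+0902 U+0924, and U+091A U+092E U+094D U+092A U+093E into U+091A U+0902 U+092A U+093E.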
3.3.5 Nasalization: Candrabindu (ँ - U+0901)

Candrabindu denotes nasalization of the preceding vowel, as in आँख ãkh 'eye' (U+0906 U+0901 U+0916). Present-day Hindi users tend to replace the Candrabindu by the Anusvara.

3.3.6 Nukta (़ - U+093C) (footnote 9)

The Nukta sign is placed below a certain number of consonants to represent sounds found only in words borrowed from Perso-Arabic. It is predominantly used in this manner in Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi and Tamang. It can be adjoined to क (U+0915), ख (U+0916), ग (U+0917), ज (U+091C) and फ (U+092B) to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style, e.g.

फ़िरोज़ firoz (U+092B U+093C U+093F U+0930 U+094B U+091C U+093C)

It is also placed under ड (U+0921) and ढ (U+0922) to indicate flapped sounds, e.g.

बढ़ bədh (U+092C U+0922 U+093C)

8 A half-nasal is used in epigraphy to indicate a nasal consonant conjoined to its corresponding "Varga" through a Halant.
9 The possible sets of consonants/vowels have been derived from various sources, viz. prior research carried out by Centre for Development of Advanced Computing's [C-DAC] Graphics Intelligence based Script Technologies [GIST] Research Labs (httpscdacinindexaspxid=mlc_gist_about), Omniglot, and inputs provided by various experts on board the NBGP for specific languages. Only Omniglot references have been provided, as they are available online.
The web publication DEVANĀGARĪ ALPHABET AND ITS ROMANIZATION [109] by the Central Hindi Directorate, Ministry of HRD, Government of India, clearly states such a use of Nukta in Hindi.

In Bodo, the Nukta is adjoined to ड (U+0921) [110]. In Maithili, it is adjoined to क (U+0915), ज (U+091C), ड (U+0921) and ढ (U+0922) [111]. In Sindhi, it is adjoined to ख (U+0916), ग (U+0917), ज (U+091C), फ (U+092B), ड (U+0921) and ढ (U+0922) [104].

In Kashmiri, it can also be adjoined to च (U+091A), छ (U+091B) and ज (U+091C) [108] to indicate the laterally released affricates:

च़ाय cay 'tea' (U+091A U+093C U+093E U+092F)
छ़ल chal 'wash' (imperative) (U+091B U+093C U+0932)
पॊज़ póz 'fact' (U+092A U+094A U+091C U+093C)

Normally, a Nukta is appended to a Consonant. However, the Santali language uses Nukta in a unique way. The Nukta is adjoined to the following vowels and vowel signs:

a आ (U+0906)
b ओ (U+0913)
c ा (U+093E)
d ो (U+094B)
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)

The Visarga is frequently used in Sanskrit and represents a sound very close to h, for example दुःख duhkh 'sorrow, unhappiness' (U+0926 U+0941 U+0903 U+0916).

The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages using Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire, as it is barred in the Maximal Starting Repertoire.

3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)

The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Halant) where default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष ksha, which is formed by क ka + ् (halant) + ष sha, is rendered with a visible Halant under क when formed by क ka + ् (halant) + Zero Width Non-joiner + ष sha. In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.

The Zero Width Joiner (ZWJ) is another invisible character, which is used in certain cases (mostly after Halant) in which a particular conjunct combination gets rendered such that the constituent consonant shapes may not be directly visible in the conjunct shape. For example, the conjunct क्ष ksha, which is formed by क ka + ् (halant) + ष sha, does not show the half form of ka joining with sha. However, using ZWJ, the constituent consonants' shapes are preserved in the visual depiction, i.e. the half form of क followed by ष, formed by क ka + ् (halant) + Zero Width Joiner + ष sha.

Earlier, the ZWJ was recommended by the Unicode Consortium to be used to generate certain special conjuncts like Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of ZWJ is now not encouraged.
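
The three ksha sequences discussed in this section differ only in the invisible character placed after the Halant. The sketch below makes that difference visible by listing code points; the helper names are hypothetical, and the example concerns rendering only, since neither ZWNJ nor ZWJ is available for Root Zone labels.

```python
HALANT = "\u094D"  # DEVANAGARI SIGN VIRAMA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"     # ZERO WIDTH JOINER


def join_consonants(first: str, second: str, joiner: str = "") -> str:
    """Build first + Halant + optional ZWNJ/ZWJ + second."""
    return first + HALANT + joiner + second


def code_points(text: str) -> list:
    """List the code points of `text` in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]
```

Calling `code_points(join_consonants("\u0915", "\u0937", ZWNJ))` shows U+200C sitting between the Halant and ष; passing `ZWJ` shows U+200D in the same position, and omitting the joiner gives the default conjunct sequence.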
4 Overall Development Process and Methodology

Under the Neo-Brahmi Generation Panel, there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP ensured that the fundamental philosophy behind building those LGRs is in sync across all Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.

4.1 Guiding Principles

The NBGP adopts the following broad principles for selection of code points in the code point repertoire, across the board, for all the scripts within its ambit.

4.1.1 Inclusion principles

4.1.1.1 Modern usage

Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code point repertoire.

4.1.1.2 Unambiguous use

Every character proposed should have an unambiguous understanding among the linguistic community about its usage in the language.

4.1.2 Exclusion principles

The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are pre-requisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope

The code point repertoire for the root zone being a very special case, up the ladder in the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:

i The Unicode Standard

Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.

ii IDNA Protocol

Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.

For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ़ (U+093C), can be used instead.
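
The relationship between U+0958 and its decomposed form can be observed with Unicode normalization. The sketch below uses Python's standard `unicodedata` module; the helper names are illustrative, and the example shows only the normalization behaviour, not the full IDNA validity check.

```python
import unicodedata

QA_PRECOMPOSED = "\u0958"       # DEVANAGARI LETTER QA
QA_DECOMPOSED = "\u0915\u093C"  # KA + NUKTA


def nfc(text: str) -> str:
    """Return the NFC normalization of `text`."""
    return unicodedata.normalize("NFC", text)


def is_nfc_stable(text: str) -> bool:
    """True if `text` is unchanged by NFC normalization."""
    return nfc(text) == text
```

Here `nfc(QA_PRECOMPOSED)` returns the two-code-point sequence U+0915 U+093C, so the precomposed letter does not survive normalization, whereas the decomposed sequence is already stable.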
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These are required in certain cases where an atypical visual shape of an akshar is desired.

In domain names, due to the absence of space " " or tab "-", there will be cases where the inability to use ZWNJ can pose some issues: where two words need to be joined together, the previous word needs to end in an explicit Halant, and the next word begins with a consonant. In that case, a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if the two words देश् (deš, nation) and विदेश (videš, foreign land) are juxtaposed to each other, the resultant word, i.e. "देश्विदेश" (footnote 10), is not the appropriate way of rendering it. The appropriate rendering, with the Halant shown explicitly at the end of the first word, can be achieved by adding a ZWNJ in between the two words.

As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire if necessary.

However, there may not be much of an impact of the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes (footnote 11) may not be able to be made; however, there will not be any impact on the phonetic level.
iii Maximal Starting Repertoire

The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].

To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
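
The three successive filters summarized above amount to repeated set intersection. The sketch below uses tiny toy sets built from the code points mentioned in this section; they are illustrative stand-ins, not the actual Unicode block, IDNA PVALID list or MSR, and the function name is hypothetical.

```python
# Toy stand-ins for the three layers (NOT the real data sets).
SCRIPT_BLOCK = {0x0915, 0x093C, 0x093D, 0x0958}  # encoded in Unicode
IDNA_PVALID = {0x0915, 0x093C, 0x093D}           # U+0958 is excluded by IDNA
MSR = {0x0915, 0x093C}                           # U+093D is excluded by the MSR


def root_zone_candidates(script_block, idna_pvalid, msr):
    """Keep only code points that pass every successive filter."""
    return sorted(
        cp for cp in script_block
        if cp in idna_pvalid and cp in msr
    )
```

With these toy sets, `root_zone_candidates(SCRIPT_BLOCK, IDNA_PVALID, MSR)` keeps only U+0915 and U+093C: Letter Qa falls out at the IDNA layer and the Avagraha at the MSR layer, mirroring the two examples given in the text.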
10 In this particular case, though it is possible to get the required display by dropping the explicit Halant at the end of the word, one can argue that the pronunciation of the two words, i.e. देश् and देश, is different, and hence it changes the fundamental word.
11 Case of the two renderings of ksha: the first is composed with क + ् + ष, while the latter is composed with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks

The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.

4.1.2.3 No Symbols and Abbreviations

Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970) etc. will not be included.

4.1.2.4 No Rare and Obsolete Characters

There are characters which have been added to Unicode to accommodate rare forms, especially like DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ॄ (U+0944) and ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.

4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic

Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.

4.2 Methodology to incorporate the feedback received through Public Comment process

The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were of an editorial nature. Some comments called for attention to the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided, for each individual comment received, whether it needed a change in the overall LGR recommendation.

Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2 Devanagari Code Page from [MSR]
Color convention (see footnote 12):
- All characters that are included in the [MSR]: yellow background
- PVALID in IDNA2008 but excluded from the MSR for various reasons: pinkish background
- Not PVALID in IDNA2008: white background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references are given in the last column, titled Reference. To cover all the Devanagari code points, references for Hindi, Marathi, Sanskrit, Sindhi and Kashmiri are given. Though only five representative languages have been chosen for referencing, together they cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (not an exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 | 0901 | ँ | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1 Hindi, Nepali | [0][101][102][103][105][108][110][111][112][113]
2 | 0902 | ं | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
3 | 0903 | ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 Hindi | [0][101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][101][102][108][112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 Nepali | [0][102][103][110][112][113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
53 | 093A | ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
55 | 093C | ़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 Hindi | [0][101][105][108][110][109][111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
59 | 0941 | ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
60 | 0942 | ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
61 | 0943 | ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E (= candra) | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0][100][101][108]
63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant/Virama | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9][100][102][108][112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph" construct (see footnote 13).
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code points (not an exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7 Sequences
13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by the normal Ra, U+0930), the term "Reph" is used here.
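The conditional inclusion of RRA described above can be illustrated with a minimal sketch. It assumes that U+0931 is admissible only as the first code point of one of the two sequences in Table 7; the function name is hypothetical and the check says nothing about whether the rest of the label is valid.

```python
# Sketch only: assumes RRA (U+0931) may appear solely at the start of one of
# the two sequences listed in Table 7. Names are illustrative.
RRA = "\u0931"
VIRAMA = "\u094D"
EYELASH_REPH_SEQUENCES = (
    RRA + VIRAMA + "\u092F",  # RRA + VIRAMA + YA
    RRA + VIRAMA + "\u0939",  # RRA + VIRAMA + HA
)


def rra_only_in_sequences(label: str) -> bool:
    """Return True if every U+0931 in the label starts a Table 7 sequence."""
    for index, char in enumerate(label):
        if char == RRA and not label.startswith(EYELASH_REPH_SEQUENCES, index):
            return False
    return True
```

A label without any U+0931 passes trivially, since the check constrains only the placement of RRA.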
5.3 Code points not included
The following code points have not been included in the repertoire.
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per the Conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per the Conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of forming their words, known as akshar. The next section gives detailed akshar formation rules as applicable to the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required to accommodate additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first subsection lists the categories of the characters in the form of variables; in the rules, the variable names are used instead of their descriptive names. The second subsection lists four operators, along with their functions, which are assumed while specifying the rules. The following two subsections describe the two major categories of akshar formation, the first of which begins with a vowel and the second with a consonant. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol    Function
|         Alternative
[ ]       Optional
*         Variable Repetition
( )       Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may optionally be followed by an Anusvara (B), Candrabindu (D) or Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ (a) | U+0905
Vowel + Anusvara | V[B] | अं (aṁ) | U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ (aṃ) | U+0905 U+0901
Vowel + Visarga | V[X] | अः (aḥ) | U+0905 U+0903
Table 9
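The vowel sequence V[B|D|X] can be checked mechanically by first mapping each code point to its category variable and then matching the resulting string. The following is a minimal sketch under the assumption that only the Hindi-relevant vowels from Table 6 are classified as V; the table and function names are illustrative and not part of the LGR.

```python
import re

# Assumed, partial category table (Hindi-relevant vowels from Table 6 only).
CATEGORY = {code_point: "V" for code_point in (
    0x0905, 0x0906, 0x0907, 0x0908, 0x0909, 0x090A, 0x090B,
    0x090D, 0x090F, 0x0910, 0x0911, 0x0913, 0x0914,
)}
CATEGORY.update({0x0902: "B", 0x0901: "D", 0x0903: "X"})

# V optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(r"V[BDX]?")


def to_categories(text: str) -> str:
    """Map each character to its category letter; '?' marks anything unknown."""
    return "".join(CATEGORY.get(ord(char), "?") for char in text)


def is_vowel_sequence(text: str) -> bool:
    """Return True if the text is a single vowel sequence V[B|D|X]."""
    return VOWEL_SEQUENCE.fullmatch(to_categories(text)) is not None
```

For instance, U+0905 U+0902 is accepted, while U+0905 U+0902 U+0903 is rejected because at most one of B, D or X may follow the vowel.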
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may optionally be followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. The consonant sequence can be extended further after N, M and H. Each of these cases is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क (ka) | U+0915 <single character>
Consonant + Nukta | C[N] | क़ (ḳa) | U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि (ki) | U+0915 U+093F
Consonant + Anusvara | C[B] | कं (kaṁ) | U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ (kaṃ) | U+0915 U+0901
Consonant + Visarga | C[X] | कः (kaḥ) | U+0915 U+0903
Consonant + Halant | C[H] | क् (k, pure consonant) | U+0915 U+094D
Table 11
2A. A CM sequence can optionally be followed by D, B or X
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं (kīṁ) | U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ (kāṃ) | U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः (kīḥ) | U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14)
*3(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य (nkrya) | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
14 In the case of Sanskrit, it can join up to 5 consonants.
Subsets
3A. The combination may be followed by M, B, D or X
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की (kkī) | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं (kkaṁ) | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ (kkaṃ) | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः (kkaḥ) | U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं (kkīṁ) | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ (kkīṃ) | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः (kkīḥ) | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules when one takes into account digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
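As an illustration, the consonant-sequence rules of Section 5.5.4 can be folded into one pattern over the category variables. This is a sketch of one possible reading, not the normative WLE rules: it treats a consonant with an optional Nukta as one unit, allows up to three Halant-joined units before the last consonant (four consonants in total), and allows the cluster to end in either a Halant or an optional Matra followed by an optional B, D or X. The category table covers only Hindi-relevant code points and all names are hypothetical.

```python
import re

# Assumed Hindi-relevant Matras from Table 6 (Kashmiri-only signs left out).
MATRAS = {0x093E, 0x093F, 0x0940, 0x0941, 0x0942, 0x0943,
          0x0945, 0x0947, 0x0948, 0x0949, 0x094B, 0x094C}
# Code points inside the consonant range that are not standalone consonants here.
EXCLUDED_CONSONANTS = {0x0929, 0x0931, 0x0934}
SIGNS = {0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H", 0x093C: "N"}


def category(code_point: int) -> str:
    """Return the category letter for a code point, or '?' if unclassified."""
    if 0x0915 <= code_point <= 0x0939 and code_point not in EXCLUDED_CONSONANTS:
        return "C"
    if code_point in MATRAS:
        return "M"
    return SIGNS.get(code_point, "?")


# Up to three (C[N]H) units, a final C[N], then H or [M][B|D|X].
CONSONANT_SEQUENCE = re.compile(r"(?:CN?H){0,3}CN?(?:H|M?[BDX]?)")


def is_consonant_sequence(text: str) -> bool:
    """Return True if the text forms a single consonant-initial akshar."""
    categories = "".join(category(ord(char)) for char in text)
    return CONSONANT_SEQUENCE.fullmatch(categories) is not None
```

The bound of three in `{0,3}` mirrors the "up to 4" consonants stated for Hindi; since the WLE rules in Section 7 impose no such limit, an implementation following them would relax that bound.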
6 Variants
There are no characters or character sequences in Devanagari which can be created using the characters permitted as per the [MSR] and which look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and are hence proposed as variants. The following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for the positioning of the Nukta character (U+093C), which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations required the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are proposed:
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). In theory this implies a regenerative tendency: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C), in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8828]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
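The effect of the "not followed by Nukta" condition can be sketched as follows. This is an illustrative example only, with hypothetical function names: it finds the positions at which the forward mapping of Table 16 (Variant 1 to Variant 2) may apply and applies it at one position.

```python
NUKTA = "\u093C"
# Variant 1 code points from Table 16: U+0906, U+0913, U+093E, U+094B.
NUKTA_VARIANT_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def forward_variant_positions(label: str) -> list:
    """Indices where Variant 1 -> Variant 2 applies: base not followed by Nukta."""
    return [
        index
        for index, char in enumerate(label)
        if char in NUKTA_VARIANT_BASES and label[index + 1:index + 2] != NUKTA
    ]


def apply_forward(label: str, index: int) -> str:
    """Insert a Nukta after the base character at the given index."""
    return label[:index + 1] + NUKTA + label[index + 1:]
```

Once a Nukta has been inserted after a base character, that base no longer satisfies the condition, so the mapping cannot be applied a second time at the same place and the U+093C U+093C sequence never arises.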
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the left-hand-side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:
- 093E 093C 0901 is added to Set A
- 0906 093C 0901 is added to Set B
- 0906 093C 0902 is added to Set C
- 093E 093C 0902 is added to Set D
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
Additional notes:
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapping with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
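The extended variant sets can be represented as plain data, from which a lookup that is symmetric and transitive within each set follows directly. The sketch below is illustrative only: it encodes the four final sets from the table above as tuples of code points, ignores the contextual rules, and uses hypothetical names.

```python
# Final variant sets A-D as tuples of code points (context rules omitted).
VARIANT_SETS = {
    "A": [(0x093E, 0x0901), (0x093E, 0x093C, 0x0901), (0x0949, 0x0902)],
    "B": [(0x0906, 0x0901), (0x0906, 0x093C, 0x0901), (0x0911, 0x0902)],
    "C": [(0x0906, 0x0902), (0x0906, 0x093C, 0x0902), (0x0974,)],
    "D": [(0x093B,), (0x093E, 0x0902), (0x093E, 0x093C, 0x0902)],
}


def build_variant_lookup(variant_sets: dict) -> dict:
    """Map every sequence to the set of all other sequences in its variant set."""
    lookup = {}
    for members in variant_sets.values():
        for sequence in members:
            lookup[sequence] = {other for other in members if other != sequence}
    return lookup
```

Because every member of a set maps to every other member, the relation is symmetric and transitive by construction, which is the property that adding the Nukta sequences to Sets A to D is meant to secure.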
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
ऺ U+093A | ं U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
ॆ U+0946 | े U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case for making them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu ँ (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when followed by an Anusvara ं (U+0902), create a shape which resembles a Candrabindu ँ (U+0901). This gives rise to pairs which are rendered exactly alike in many fonts. Though some fonts render them differently, the behavior is not consistent and leaves room for ambiguity. The actual variant pairs are as follows:
Variant 1 | Variant 2
ँ U+0901 | ॅं U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not join properly in the absence of an independent character, like a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence to the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) requires that, to create a variant label, every Candrabindu ँ (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that this sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu ँ (U+0901) can be preceded by a Consonant, Vowel, Nukta or Matra, the same is not true for ॅ (U+0945), which is a Matra: a Matra can only be preceded by a Consonant or by a Consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu ँ (U+0901) is preceded by a Consonant or by a Consonant followed by a Nukta.
There is no additional constraint on the character preceding ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire, and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as the mapping (U+0901) -> (U+0945 U+0902).
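The context condition of this rule can be sketched as a position filter. The example below is an illustration under stated assumptions: the consonant test covers the consonants of Table 6 (U+0931 is left out because it enters the repertoire only through the Table 7 sequences), and the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"
# Assumed consonant set, following Table 6.
EXCLUDED = {0x0929, 0x0931, 0x0934}
SINDHI_CONSONANTS = {0x097B, 0x097C, 0x097E, 0x097F}


def is_consonant(char: str) -> bool:
    """Return True if the character is a consonant of the assumed set."""
    code_point = ord(char)
    in_main_range = 0x0915 <= code_point <= 0x0939 and code_point not in EXCLUDED
    return in_main_range or code_point in SINDHI_CONSONANTS


def candrabindu_variant_positions(label: str) -> list:
    """Indices of U+0901 that are preceded by C, or by C followed by N."""
    positions = []
    for index, char in enumerate(label):
        if char != CANDRABINDU or index == 0:
            continue
        previous = label[index - 1]
        if is_consonant(previous):
            positions.append(index)
        elif previous == NUKTA and index >= 2 and is_consonant(label[index - 2]):
            positions.append(index)
    return positions
```

A Candrabindu that follows a Vowel or a Matra is therefore left without a variant under this mapping, which matches the observation that U+0945 cannot occur in those positions.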
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing either of these variants is chosen first, the other equivalent variant label should be blocked.
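A first-come blocked disposition can be illustrated by reducing each label to a key in which every variant is replaced by one fixed representative; a later label is blocked if its key has already been taken. The sketch below is a simplification and not the LGR mechanism itself: it uses only an illustrative subset of the Table 17 pairs, ignores all context rules, and the names are hypothetical.

```python
# Illustrative subset of Table 17: (Variant 2, Variant 1) pairs. The choice of
# Variant 1 as representative is arbitrary, since no variant is preferred.
REPRESENTATIVES = [
    ("\u0905\u0902", "\u0973"),
    ("\u0906\u0902", "\u0974"),
    ("\u093E\u0902", "\u093B"),
    ("\u0910", "\u090E"),
    ("\u0947", "\u0946"),
    ("\u0914", "\u0975"),
    ("\u094C", "\u094F"),
]


def variant_key(label: str) -> str:
    """Replace each listed variant by its representative to get a comparison key."""
    for source, representative in REPRESENTATIVES:
        label = label.replace(source, representative)
    return label


def try_register(registered_keys: set, label: str) -> bool:
    """Register the label unless an equivalent variant label was chosen earlier."""
    key = variant_key(label)
    if key in registered_keys:
        return False  # blocked: a variant of this label is already registered
    registered_keys.add(key)
    return True
```

With an empty set of keys, registering U+0915 U+094C succeeds, after which U+0915 U+094F is refused, while an unrelated label such as U+0915 U+0940 is still accepted.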
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations were taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word "most" is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed as cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are considered equivalents of each other, semantically or otherwise. They are grouped only on the basis of possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
U+0902 | U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
U+093A | U+0A02
U+093C | U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
U+0945 | U+0A71
U+0946 | U+0A47
U+0946 | U+0A4B
U+0947 | U+0A47
U+0947 | U+0A4B
U+0948 | U+0A48
U+0956 | U+0A41
U+0957 | U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B, Cross-script Confusables, lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6, Code point repertoire:
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is U+0931 (ऱ - DEVANAGARI LETTER RRA), H is U+094D (DEVANAGARI SIGN VIRAMA) and C3 is either U+092F (य - DEVANAGARI LETTER YA) or U+0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (required in the Santali language)
b. ओ (U+0913) (required in the Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (required in the Santali language)
b. ो (U+094B) (required in the Santali language)
2. H must be preceded by C or CN, where CN is a C followed by an N.
3. M must be preceded by C or CN, where CN is a C followed by an N.
4. X must be preceded by either of V, C, N or M.
5. B must be preceded by either of V, C, N or M.
6. D must be preceded by either of V, C, N or M.
7. V can NOT be preceded by H (details in "Case of V preceded by H").
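The seven context rules above can be sketched as a small checker. This is only an illustration, not the normative XML rule set: the category sets C, V and M are assumed from Tables 3, 4 and 5 of this document (the authoritative assignment is Table 6, Code point repertoire), the function name `wle_violations` is hypothetical, and the Eyelash Reph sequences and variant-specific rules are not modelled.

```python
# Category sets assumed from Tables 3, 4 and 5; Table 6 is the normative source.
C = (set(range(0x0915, 0x0929)) | set(range(0x092A, 0x0931))
     | {0x0932, 0x0933} | set(range(0x0935, 0x093A)))
V = (set(range(0x0904, 0x0915)) - {0x090C}) | {0x0960} | set(range(0x0972, 0x0978))
M = {0x093A, 0x093B, 0x094F, 0x0956, 0x0957} | set(range(0x093E, 0x094D))
B, D, X, N, H = 0x0902, 0x0901, 0x0903, 0x093C, 0x094D

# Sets used by rule 1, as listed in the text.
C1 = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C, 0x0921, 0x0922, 0x092B}
V1 = {0x0906, 0x0913}
M1 = {0x093E, 0x094B}


def wle_violations(label):
    """Return the sorted WLE rule numbers (1-7) that the label violates."""
    cps = [ord(ch) for ch in label]
    broken = set()
    for i, cp in enumerate(cps):
        prev = cps[i - 1] if i > 0 else None
        prev2 = cps[i - 2] if i > 1 else None
        # "C or CN": a consonant, or a Nukta that itself follows a consonant.
        after_c_or_cn = prev in C or (prev == N and prev2 in C)
        # "V, C, N or M" as required by rules 4 to 6.
        after_v_c_n_m = prev in V or prev in C or prev == N or prev in M
        if cp == N and prev not in (C1 | V1 | M1):
            broken.add(1)
        elif cp == H and not after_c_or_cn:
            broken.add(2)
        elif cp in M and not after_c_or_cn:
            broken.add(3)
        elif cp == X and not after_v_c_n_m:
            broken.add(4)
        elif cp == B and not after_v_c_n_m:
            broken.add(5)
        elif cp == D and not after_v_c_n_m:
            broken.add(6)
        elif cp in V and prev == H:
            broken.add(7)
    return sorted(broken)
```

A label that passes returns an empty list; a label such as U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 is reported under rule 7, because a vowel follows a Halant.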
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
- Variant is not defined if followed by a Nukta (see 6.4.1)
- Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7, Sequences, it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any character that can end a valid akshar. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार aːməchaːr 'Mango pickle' (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of the H for full representation of the intended sound. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data.
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11.0, http://www.unicode.org/versions/Unicode11.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate, Ministry of HRD, Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1. Diringer, D. The Alphabet: A Key to the History of Mankind. 3rd Edition, in 2 Volumes. Hutchinson, London, 1968.
10.2 DEVANĀGARĪ
1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966].
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar, 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1. IS 10401: 8-bit code for information interchange, 1982.
2. IS 10315: 7-bit coded character set for information interchange, 1985.
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987.
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001.
5. ISO 2375: Procedure for registration of escape sequences, 2003.
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001.
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are grouped only on the basis of possible visual confusability.
At first they may not look exactly the same; however, in a given context (e.g. in a browser bar, as part of a domain name, or as a single word where there is no surrounding text from the same script to distinguish it) they can create visual confusion.
A label can be considered to have a cross-script variant label only if all of its constituent characters/akshars have an equivalent confusable in the other script. If there is even a single character/akshar which does not have an equivalent visual confusable in the other script, it provides a visual distinction and hence makes the string non-confusable.
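The all-or-nothing condition in the last paragraph can be sketched as follows. This is a minimal illustration under stated assumptions: the mapping holds only a few single-code-point Devanagari-Gurmukhi pairs copied from Table 19, multi-code-point akshars are ignored, and the names `DEVA_TO_GURU` and `cross_script_variant` are hypothetical.

```python
# A few single-code-point pairs from Table 19 (Devanagari -> Gurmukhi).
# Illustrative subset only; the full table also contains sequences.
DEVA_TO_GURU = {
    0x0917: 0x0A17,  # GA
    0x091F: 0x0A1F,  # TTA
    0x0920: 0x0A20,  # TTHA
    0x092E: 0x0A38,  # Devanagari MA -> Gurmukhi SA
    0x093F: 0x0A3F,  # vowel sign I
    0x0940: 0x0A40,  # vowel sign II
}


def cross_script_variant(label, mapping=DEVA_TO_GURU):
    """Return the look-alike label in the other script, or None.

    None is returned as soon as one code point has no counterpart,
    because that code point is enough to tell the two labels apart.
    """
    out = []
    for ch in label:
        target = mapping.get(ord(ch))
        if target is None:
            return None
        out.append(chr(target))
    return "".join(out)
```

With this subset, U+091F U+0940 U+0917 maps to U+0A1F U+0A40 U+0A17, while any label containing U+0915 yields None because U+0915 has no entry on the Devanagari side.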
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 ¹⁷ | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
U+0945 | U+0981 | Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
the reason why some of the characters across the scripts that will be considered under the Neo-Brahmi GP look similar to each other despite belonging to totally different code blocks of the Unicode Standard.
A look at the development of Devanagari from Brahmi gives an insight into how the Indic scripts have come to be diversified: the handiwork of engravers and writers who used different types of strokes led to different regional styles. The development of the script is outlined below. Figure 1, Pictorial depiction of evolution of Devanagari, illustrates the stages in the evolution of the script.²
Period | Description
300 BCE | Mauryan: Early Brahmi form in the Asokan edicts. Some scholars believe that Brahmi itself evolved from Kharoshthi, a script written right to left.
200 CE | Kushan/Satavahana Dynasties
400 CE | Gupta Dynasty
600 CE | Yasodharman
800 CE | Origins of the present-day Nagari script. Vardhana dynasty in the North and Pallava period in the South.
900 CE | The period of the Chalukyas and Rashtrakutas
1100 CE | Continuation of the Chalukya rule
1300 CE | Yadavas in the north and Kakatiyas in the south
1500 CE | The Vijayanagar empire
Table 1 Evolution of Devanagari
2 http://www.acharya.gen.in:8080/sanskrit/script_dev.php
Figure 1 Pictorial depiction of evolution of Devanagari
3.2 Languages considered
Devanagari is used by over 120 languages, which makes it one of the most used scripts in the world. Languages using Devanagari as their primary script belong to varying geo-political scenarios, as given below:
- designated as official (scheduled) languages of some countries
- used by communities living in urban areas
- used by communities living in rural yet accessible areas
- used by communities living in far-flung areas which are not easily connected either by roads or by communication mechanisms
Information about official (scheduled) languages of countries is easily available. Information about languages used by communities living in urban areas is also easily obtainable. Some effort was needed to cover the languages spoken by communities living in rural yet accessible areas. However, it was quite difficult to cover the rest of the languages, spoken by communities living in remote tribal areas which are generally not connected by road or by means of communication. Defining the scope of language coverage was hence essential to limit the scope of the work to be undertaken for the analysis of the Devanagari LGR.
The NBGP decided to employ the "Expanded Graded Intergenerational Disruption Scale" [EGIDS], which is designed to measure the status of the languages of the world in terms of endangerment or development. The EGIDS consists of 13 levels, with each higher number on the scale representing a greater level of disruption to the intergenerational transmission of the language. The NBGP decided to accommodate in its analysis all the languages belonging to EGIDS Scale 1 to 4, which represent languages that are, in one form or another, still in use. Following are the descriptions³ of those scales:
Scale | Label | Description
1 | National | The language is used in education, work, mass media and government at the national level.
2 | Provincial | The language is used in education, work, mass media and government within major administrative subdivisions of a nation.
3 | Wider Communication | The language is used in work and mass media without official status to transcend language differences across a region.
4 | Educational | The language is in vigorous use, with standardization and literature being sustained through a widespread system of institutionally supported education.
Languages belonging to Level 5 and higher are not in widespread usage.
Below is the tabular representation of the languages that have been considered for the Devanagari LGR.
3 https://www.ethnologue.com/about/language-status
EGIDS Scale | Languages
EGIDS Scale 1 | Hindi, Nepali
EGIDS Scale 2 | Konkani, Maithili, Marathi, Sindhi
EGIDS Scale 3 | Bhatri, Halbi, Kinnauri, Kukna, Panchpargania, Sadri, Wagdi
EGIDS Scale 4 | Bhojpuri, Chhattisgarhi, Dogri, Kashmiri, Limbu, Magahi, Sanskrit, Santali, Tamang (Eastern), Avadhi, Newar, Saraiki⁴
Table 2 Languages considered under Devanagari LGR
Despite being classified under EGIDS Scale 5, the Boro language is also considered under the Devanagari LGR, as it is one of the scheduled languages of India and is widely spoken.
Apart from the above-mentioned languages, Braj, Dhundari, Mundari and Kharia have also been considered for the analysis, as the communities using them were accessible and provided their inputs.
3.2.1 Case of Sanskrit
Sanskrit is generally perceived as an archaic language used only in ancient religious texts. However, it is worth noting that there is a quite vibrant and active user community of Sanskrit in India which practices Sanskrit on a day-to-day basis. Sanskrit is still taught in schools under various State and Central educational boards. There is increasing use of Sanskrit on social media as well. The same is reflected in the EGIDS scale, where Sanskrit is categorized in Scale 4, indicating the status of the language as "Educational".
4 Though listed in EGIDS scale 4, Saraiki is not covered by the NBGP. As per Ethnologue, the Devanagari script is no longer in use by the Saraiki community. Ref: https://www.ethnologue.com/language/skr
3.3 The structure of written Devanagari
Devanagari is an alphasyllabary, and the heart of the writing system is the akshar. It is this unit which is instinctively recognized by users of the script. To understand the notion of akshar, a brief overview of the writing system is provided in this section; the akshar itself is treated in depth in Section 5.4. The writing system of Devanagari can be summed up as composed of the following:
3.3.1 The Consonants
Devanagari consonants have an implicit schwa⁵ ə vowel included in them. As per traditional classification, they are categorized according to their phonetic properties (especially in terms of place plus manner of articulation). There are 5 Varga groups (classes) and one non-Varga group. Each Varga, which corresponds to the Stops, contains five consonants classified as per their properties. The first four consonants are classified on the basis of voicing and aspiration, and the last is the corresponding nasal.
Varga | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | क U+0915 | ख U+0916 | ग U+0917 | घ U+0918 | ङ U+0919
Palatal | च U+091A | छ U+091B | ज U+091C | झ U+091D | ञ U+091E
Retroflex | ट U+091F | ठ U+0920 | ड U+0921 | ढ U+0922 | ण U+0923
Dental | त U+0924 | थ U+0925 | द U+0926 | ध U+0927 | न U+0928
Bi-labial | प U+092A | फ U+092B | ब U+092C | भ U+092D | म U+092E
Table 3 Varga classification of consonants
Non-Varga | य U+092F | र U+0930 | ल U+0932 | ळ U+0933 | व U+0935 | श U+0936 | ष U+0937 | स U+0938 | ह U+0939
Table 4 Non-Varga consonants
5 Although representing the implicit vowel as a is more correct orthographically, the schwa ə, although not part of the orthographic system, has been used, since the a would be misunderstood and read as अआा
3.3.2 The Implicit Vowel Killer: Halant⁶
All consonants contain an implicit vowel (schwa). A special sign is needed to denote that this implicit vowel is stripped off. This is known as the Halant (U+094D). The Halant thus joins two consonants and creates conjuncts, which generally combine 2 to 4 consonants. In rare cases it can join up to 5 consonants. However, the notion of a maximum number of consonants joining to form one akshar is empirical: it is just an observation drawn from the words that have been observed to date. Given the confluence of languages happening in the Internet age, the possibility that one may want a generic Top Level Domain [gTLD] with more than the observed maximum cannot be ruled out. Hence, in the LGR work, this limit will not be enforced.⁷
3.3.3 Vowels
Separate symbols exist for all Vowels which are pronounced independently, either at the beginning or after a vowel sound. To indicate a Vowel sound other than the implicit one, a Vowel sign (Matra) is attached to the consonant. Since the consonant has a built-in schwa, there are equivalent Matras for all vowels except अ.
The correlation is shown as follows:
Vowel | Corresponding vowel sign (Matra)
अ U+0905 |
आ U+0906 | ा U+093E
इ U+0907 | ि U+093F
ई U+0908 | ी U+0940
उ U+0909 | U+0941
ऊ U+090A | U+0942
ऋ U+090B | U+0943
ए U+090F | U+0947
ऐ U+0910 | U+0948
ओ U+0913 | ो U+094B
औ U+0914 | ौ U+094C
ॳ U+0973 | U+093A
ॴ U+0974 | ऻ U+093B
ऎ, ऄ U+090E, U+0904 | U+0946
ऒ U+0912 | ॊ U+094A
ऍ, ॲ U+090D, U+0972 | U+0945
ॠ U+0960 | U+0944
ऑ U+0911 | ॉ U+0949
ॵ U+0975 | ॏ U+094F
ॶ U+0976 | U+0956
ॷ U+0977 | U+0957
Table 5 Vowels with corresponding Matras
Marathi uses ॲ (U+0972) instead of ऍ (U+090D).
6 Unicode (cf. Unicode 3.0 and above) prefers the term Virama. In this report both terms have been used to denote the character that suppresses the inherent vowel.
7 This can be the case when a foreign language word which admits a large number of consonants is transliterated into Devanāgarī.
3.3.4 The Anusvara (U+0902)
The Anusvara represents a homorganic nasal. It replaces a conjunct group of a Nasal Consonant + Halant + Consonant belonging to that particular varga. Before a non-varga consonant, the Anusvara represents a nasal sound. Modern Hindi, Marathi and Konkani prefer the Anusvara to the corresponding half-nasal.⁸
सन्त vs संत sənt 'saint' (U+0938 U+0928 U+094D U+0924 vs U+0938 U+0902 U+0924)
चम्पा vs चंपा tʃəmpa 'a flower belonging to the genus Plumeria' (U+091A U+092E U+094D U+092A U+093E vs U+091A U+0902 U+092A U+093E)
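The substitution described above, a varga nasal plus Halant before a consonant of the same varga being written as an Anusvara, can be sketched in a few lines. This is an illustration only: the varga ranges follow Table 3, the function name `to_anusvara` is hypothetical, and leaving geminate nasals (nasal + Halant + the same nasal) untouched is an assumption of this sketch rather than something stated in the text.

```python
HALANT = "\u094D"
ANUSVARA = "\u0902"

# The five vargas of Table 3; the last code point of each range is its nasal.
VARGAS = [
    range(0x0915, 0x091A),  # velar
    range(0x091A, 0x091F),  # palatal
    range(0x091F, 0x0924),  # retroflex
    range(0x0924, 0x0929),  # dental
    range(0x092A, 0x092F),  # bi-labial
]


def to_anusvara(word):
    """Rewrite 'nasal + Halant + same-varga stop' with an Anusvara."""
    out = []
    i = 0
    while i < len(word):
        if i + 2 < len(word) and word[i + 1] == HALANT:
            nasal, following = ord(word[i]), ord(word[i + 2])
            # v[-1] is the varga nasal, v[:-1] the four stops of that varga.
            if any(nasal == v[-1] and following in v[:-1] for v in VARGAS):
                out.append(ANUSVARA)
                i += 2  # skip the nasal and the Halant
                continue
        out.append(word[i])
        i += 1
    return "".join(out)
```

Applied to the two examples, U+0938 U+0928 U+094D U+0924 becomes U+0938 U+0902 U+0924, and U+091A U+092E U+094D U+092A U+093E becomes U+091A U+0902 U+092A U+093E.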
3.3.5 Nasalization: Candrabindu (U+0901)
Candrabindu denotes nasalization of the preceding vowel, as in आँख ãkh 'eye' (U+0906 U+0901 U+0916). Present-day Hindi users tend to replace the Candrabindu by the Anusvara.
3.3.6 Nukta (U+093C)⁹
The Nukta sign is placed below a certain number of consonants to represent sounds found only in words borrowed from Perso-Arabic. It is predominantly used in this manner in Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi and Tamang. It can be adjoined to क (U+0915), ख (U+0916), ग (U+0917), ज (U+091C) and फ (U+092B) to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style, e.g.
फ़िरोज़ firoz (U+092B U+093C U+093F U+0930 U+094B U+091C U+093C)
It is also placed under ड (U+0921) and ढ (U+0922) to indicate flapped sounds, e.g.
बढ़ bədh (U+092C U+0922 U+093C)
The web publication DEVANĀGARĪ ALPHABET AND ITS ROMANIZATION [109] by the Central Hindi Directorate, Ministry of HRD, Government of India, clearly states such a use of Nukta in Hindi.
In Bodo, the Nukta is adjoined to ड (U+0921) [110]. In Maithili, it is adjoined to क (U+0915), ज (U+091C), ड (U+0921) and ढ (U+0922) [111]. In Sindhi, it is adjoined to ख (U+0916), ग (U+0917), ज (U+091C), फ (U+092B), ड (U+0921) and ढ (U+0922) [104].
In Kashmiri, it can also be adjoined to च (U+091A), छ (U+091B) and ज (U+091C) [108] to indicate the laterally released affricates:
च़ाय cay 'tea' (U+091A U+093C U+093E U+092F)
छ़ल chal 'wash' (imperative) (U+091B U+093C U+0932)
पॊज़ póz 'fact' (U+092A U+094A U+091C U+093C)
Normally a Nukta is appended to a Consonant. However, the Santali language uses the Nukta in a unique way: the Nukta is adjoined to the following vowels and vowel signs:
a. आ (U+0906)
b. ओ (U+0913)
c. ा (U+093E)
d. ो (U+094B)
8 A half-nasal is used in epigraphy to indicate a nasal consonant conjoined to its corresponding "Varga" through a Halant.
9 The possible sets of consonants/vowels have been derived from various sources, viz. prior research carried out by the Centre for Development of Advanced Computing's [C-DAC] Graphics Intelligence based Script Technologies [GIST] Research Labs (https://cdac.in/index.aspx?id=mlc_gist_about), Omniglot, and inputs provided by various experts on board the NBGP for specific languages. Only Omniglot references have been provided, as they are available online.
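The per-language Nukta bases described in this section can be collected into a small lookup. This is a sketch under stated assumptions: the dictionary `NUKTA_BASES` and the helper `languages_allowing` are hypothetical names, the Hindi entry combines the Perso-Arabic and flapped-sound groups attributed to [109], and Tamang is omitted because the text does not list its bases. The union of all entries happens to coincide with the sets C1, V1 and M1 used by WLE rule 1 in Section 7.

```python
# Code points that may carry a Nukta (U+093C), grouped by language as
# described in the text. Grouping and names are illustrative assumptions.
NUKTA_BASES = {
    "Hindi":    {0x0915, 0x0916, 0x0917, 0x091C, 0x092B, 0x0921, 0x0922},
    "Bodo":     {0x0921},
    "Maithili": {0x0915, 0x091C, 0x0921, 0x0922},
    "Sindhi":   {0x0916, 0x0917, 0x091C, 0x092B, 0x0921, 0x0922},
    "Kashmiri": {0x091A, 0x091B, 0x091C},
    "Santali":  {0x0906, 0x0913, 0x093E, 0x094B},
}

# Every code point that takes a Nukta in at least one language.
ALL_BASES = set().union(*NUKTA_BASES.values())


def languages_allowing(base):
    """Return the languages (sorted by name) that adjoin a Nukta to `base`."""
    return sorted(lang for lang, bases in NUKTA_BASES.items()
                  if ord(base) in bases)
```

For instance, `languages_allowing("\u091c")` reports Hindi, Kashmiri, Maithili and Sindhi, while a consonant such as U+0924 yields an empty list.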
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)
The Visarga is frequently used in Sanskrit and represents a sound very close to h, for example दुःख duhkh 'sorrow, unhappiness' (U+0926 U+0941 U+0903 U+0916).
The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages using Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire, as it is barred by the Maximal Starting Repertoire.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after a Halant) where default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष ksha, which is formed by क ka + Halant + ष sha, is rendered as क्‌ष when formed by क ka + Halant + Zero Width Non-joiner + ष sha. In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.
The Zero Width Joiner (ZWJ) is another invisible character, used in certain cases (mostly after a Halant) in which a particular conjunct combination is rendered such that the constituent consonant shapes are not directly visible in the conjunct shape. For example, the conjunct क्ष ksha, formed by क ka + Halant + ष sha, does not show the half form of ka joining with sha. However, using the ZWJ, the constituent consonants' shapes are preserved in the visual depiction, i.e. क्‍ष, formed by क ka + Halant + Zero Width Joiner + ष sha.
Earlier, the ZWJ was recommended by the Unicode Consortium for generating certain special conjuncts like the Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of the ZWJ is no longer encouraged.
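The three ksha spellings discussed in this section differ only by one invisible code point, which is easy to see once the sequences are written out. The snippet below is a minimal sketch; the names `SEQUENCES`, `code_points` and `strip_joiners` are hypothetical helpers, and how each sequence is actually drawn depends on the font and rendering engine.

```python
import unicodedata

KA, HALANT, SSA = "\u0915", "\u094D", "\u0937"
ZWNJ, ZWJ = "\u200C", "\u200D"

# The three ways of spelling "ksha" described in the text.
SEQUENCES = {
    "conjunct": KA + HALANT + SSA,
    "explicit halant": KA + HALANT + ZWNJ + SSA,
    "half form": KA + HALANT + ZWJ + SSA,
}


def code_points(text):
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def strip_joiners(text):
    """Remove ZWNJ and ZWJ, leaving only the visible code points."""
    return text.replace(ZWNJ, "").replace(ZWJ, "")


for name, seq in SEQUENCES.items():
    print(f"{name}: {code_points(seq)}")
```

Both joiners belong to the Unicode general category Cf (format), which `unicodedata.category` confirms, and removing them collapses all three spellings into the same visible sequence.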
4 Overall Development Process and Methodology
Under the Neo-Brahmi Generation Panel, there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP has ensured that the fundamental philosophy behind building those LGRs is in sync across all Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles
The NBGP adopts the following broad principles for the selection of code points in the code point repertoire, across the board, for all the scripts within its ambit.
4.1.1 Inclusion principles
4.1.1.1 Modern usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code point repertoire.
4.1.1.2 Unambiguous use
Every character proposed should have an unambiguous understanding among the linguistic community about its usage in the language.
4.1.2 Exclusion principles
The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope

The code point repertoire for the root zone being a very special case, high up the ladder in the protocol hierarchy, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:

i. The Unicode Standard

Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.

ii. IDNA Protocol

Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.

For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ़ (U+093C), can be used instead.
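The relationship between U+0958 and its decomposed form can be checked with Python's standard `unicodedata` module. This is only an illustration of the Unicode normalization data behind the example above, not a reimplementation of the IDNA rules: U+0958 carries a canonical decomposition and is excluded from recomposition, so its NFC form is the two code point sequence.

```python
import unicodedata

QA = "\u0958"  # DEVANAGARI LETTER QA

# Canonical decomposition recorded in the Unicode Character Database.
decomposition = unicodedata.decomposition(QA)

# NFC does not recompose this pair, so the precomposed letter
# never survives normalization.
nfc = unicodedata.normalize("NFC", QA)
nfc_points = [f"U+{ord(ch):04X}" for ch in nfc]

print(decomposition, nfc_points)
```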
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These are required in certain cases where a typical visual shape of an akshar is desired.

In domain names, due to absence of space “ ” or tab “-”, there will be cases wherein inability to use ZWNJ can pose some issues where two words need to be joined together, where the previous word needs to end in an Explicit Halant and the next word begins with a consonant. In that case, a conjunct will be formed between the last consonant of the first word
and the first consonant of the second word. This visual display may not be desired. For example, if the two words देश् (deš, nation) and विदेश (videš, foreign land) are juxtaposed to each other, the resultant word, i.e. “देश्विदेश” (see footnote 10), is not the appropriate way of rendering it. The appropriate rendering of the same would show देश् with a visible Halant followed by a full विदेश, which can be achieved by adding a ZWNJ in between the two words.

As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the NBGP of that time may consider adding it to the Devanagari repertoire, if necessary.

However, there may not be much of an impact of the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes (see footnote 11) may not be able to be made; however, there will not be any impact on the phonetic level.

iii. Maximal Starting Repertoire

The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].

To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
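The successive filters summarized above can be sketched as a small pipeline. The exclusion sets below are hypothetical and heavily abbreviated: they contain only the examples named in this section, whereas the real IDNA and MSR tables are far larger, and the function name `first_failing_filter` is an assumption made for illustration.

```python
from typing import Optional

# Illustrative data only; not the actual IDNA or MSR tables.
DEVANAGARI_BLOCK = range(0x0900, 0x0980)  # Unicode Devanagari block
IDNA_EXCLUDED = {0x0958}                  # DEVANAGARI LETTER QA
MSR_EXCLUDED = {0x093D}                   # DEVANAGARI SIGN AVAGRAHA

def first_failing_filter(cp: int) -> Optional[str]:
    """Return the name of the first filter that rejects cp, or None."""
    if cp not in DEVANAGARI_BLOCK:
        return "script block"
    if cp in IDNA_EXCLUDED:
        return "IDNA"
    if cp in MSR_EXCLUDED:
        return "MSR"
    return None

for cp in (0x0915, 0x0958, 0x093D):
    print(f"U+{cp:04X}", first_failing_filter(cp))
```

A code point that passes all three filters is only a candidate; the inclusion and exclusion principles of this section still apply on top.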
10 In this particular case, though it is possible to get the required display by dropping the explicit Halant at the end of the word, in that case one can argue that the pronunciation of the two words, i.e. देश् and देश, is different and hence it changes the fundamental word.
11 Case of the conjunct क्ष and its half-form rendering: the first is composed with क + ् + ष, while the latter is composed with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks

The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and Double Danda (U+0965), will not be included.

4.1.2.3 No Symbols and Abbreviations

Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970), etc. will not be included.

4.1.2.4 No Rare and Obsolete Characters

There are characters which have been added to Unicode to accommodate rare forms, especially like DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ॄ (U+0944) and ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.

4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic

Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through the Public Comment process

The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were of an editorial nature. Some comments concerned the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided, for each individual comment received, whether it needed a change in the overall LGR recommendation.

Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire

Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.

5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4

Figure 2 Devanagari Code Page from [MSR]

Color convention (see footnote 12):
- All characters that are included in the [MSR]: Yellow background
- PVALID in IDNA 2008 but excluded from the MSR for various reasons: Pinkish background
- Not PVALID in IDNA 2008: White background

12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire

For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references of Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 | 0901 | ँ | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1 Hindi, Nepali | [0] [101] [102] [103] [105] [108] [110] [111] [112] [113]
2 | 0902 | ं | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
3 | 0903 | ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 Hindi | [0] [101] [102] [103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 Hindi | [0] [101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 Kashmiri | [0] [105] [108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0] [100] [101] [102] [108] [112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 Kashmiri | [0] [105] [108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 Nepali | [0] [102] [103] [110] [112] [113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
53 | 093A | ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 Kashmiri | [11] [105] [108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 Kashmiri | [11] [105] [108]
55 | 093C | ़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 Hindi | [0] [101] [105] [108] [110] [109] [111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
59 | 0941 | ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
60 | 0942 | ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
61 | 0943 | ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 Hindi | [0] [101] [102] [103]
62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E (= candra) | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0] [100] [101] [108]
63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0] [105] [108]
64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [105] [108] [113]
65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0] [100] [108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0] [105] [108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [105] [108] [113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [105] [108] [113]
70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant / Virama | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [105] [108] [113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0] [105] [108]
72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11] [105] [108]
73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11] [105] [108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9] [100] [102] [108] [112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11] [105] [108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11] [105] [108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11] [105] [108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11] [105] [108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11] [105] [108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8] [104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8] [104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8] [104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8] [104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the “Eyelash Reph” (see footnote 13) construct.

Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106] [107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106] [107]
Table 7 Sequences

13 Unicode uses the term “Eyelash Ra” instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by Normal Ra, U+0930), the term “Reph” is used here.
5.3 Code points not included

The following code points have not been included in the repertoire.

Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari

All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. In the next section there are detailed akshar formation rules as applicable to representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)

It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.

In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi

This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, instead of their descriptive names, the variable names are used. The second section lists four operators along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formation, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].

5.5.1 Variables involved

Dash  → Hyphen
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
5.5.2 Operators used

Symbol    Function
|         Alternative
[ ]       Optional
*         Variable Repetition
( )       Sequence Group
Table 8 Symbol functions

In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence

A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.

The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in Bengali script.

The vowel sequence in Hindi is therefore V[B|D|X].

Examples:

Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ (a) | अ U+0905
Vowel + Anusvara | V[B] | अं (aṁ) | अ ं U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ (aṃ) | अ ँ U+0905 U+0901
Vowel + Visarga | V[X] | अः (aḥ) | अ ः U+0905 U+0903
Table 9
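The vowel sequence rule V[B|D|X] can be sketched as a check over category strings. This is a minimal illustration, not the LGR's WLE implementation: the `CATEGORY` table lists only a handful of code points chosen for the example, and the function names are assumptions.

```python
import re

# Illustrative subset of the variables from Section 5.5.1.
CATEGORY = {
    "\u0905": "V", "\u0906": "V", "\u0907": "V",  # independent vowels
    "\u0902": "B",                                # Anusvara
    "\u0901": "D",                                # Candrabindu
    "\u0903": "X",                                # Visarga
}

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(r"V[BDX]?")

def categories(text: str) -> str:
    """Map each character to its variable name; '?' if not in the table."""
    return "".join(CATEGORY.get(ch, "?") for ch in text)

def is_vowel_sequence(text: str) -> bool:
    return VOWEL_SEQUENCE.fullmatch(categories(text)) is not None

print(is_vowel_sequence("\u0905\u0902"))        # Vowel + Anusvara
print(is_vowel_sequence("\u0905\u0902\u0903"))  # two modifiers after a vowel
```

Working on category strings rather than raw code points keeps the pattern identical to the notation used in the rules.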
5.5.4 Consonant Sequence

A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the Consonant sequence after the N, M and H. Each of these has been discussed in the following sections.

1. A single consonant (C)

(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)

Examples:

Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क (ka) | U+0915 <single character>
Consonant + Nukta | C[N] | क़ (ḳa) | क ़ U+0915 U+093C
Table 10

2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:

C[M|B|D|X|H]

Examples:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि (ki) | क ि U+0915 U+093F
Consonant + Anusvara | C[B] | कं (kaṁ) | क ं U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ (kaṃ) | क ँ U+0915 U+0901
Consonant + Visarga | C[X] | कः (kaḥ) | क ः U+0915 U+0903
Consonant + Halant | C[H] | क् (k, pure consonant) | क ् U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X:

(CM)[D|B|X]

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं (kīṁ) | क ी ं U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ (kāṃ) | क ा ँ U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः (kīḥ) | क ी ः U+0915 U+0940 U+0903
Table 12

3. A sequence of consonants (up to 4) joined by Halant (see footnote 14):

*3(CH)C

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य (nkrya) | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13

However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.

Subsets

3A. The combination may be followed by M, B, D or X.

Example:

14 In case of Sanskrit, it can join up to 5 consonants.
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की (kkī) | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं (kkaṁ) | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ (kkaṃ) | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः (kkaḥ) | U+0915 U+094D U+0915 U+0903
Table 14

3B. *3(CH)CM may be followed by a B, D or X.

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं (kkīṁ) | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ (kkīṃ) | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः (kkīḥ) | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
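The consonant sequence rules 1 to 3B can be folded into one pattern over category strings. The sketch below is one possible reading, not the panel's implementation: the code point ranges used for C and M are deliberately coarse (they include a few code points that the repertoire excludes), the function names are assumptions, and allowing a final Halant after a multi-consonant cluster is also an assumption.

```python
import re

def category(ch: str) -> str:
    """Coarse, illustrative classification into the Section 5.5.1 variables."""
    cp = ord(ch)
    if 0x0915 <= cp <= 0x0939:
        return "C"  # consonants (coarse range)
    if 0x093E <= cp <= 0x094C:
        return "M"  # matras (coarse range)
    return {0x0901: "D", 0x0902: "B", 0x0903: "X",
            0x093C: "N", 0x094D: "H"}.get(cp, "?")

CONS = "CN?"  # a consonant, coterminous with its optional Nukta

# Up to three (consonant + Halant) pairs, a final consonant, then either
# a Matra with an optional B/D/X, or a single B, D, X or H.
CONSONANT_SEQUENCE = re.compile(
    rf"(?:{CONS}H){{0,3}}{CONS}(?:M[BDX]?|[BDXH])?"
)

def is_consonant_sequence(text: str) -> bool:
    cats = "".join(category(ch) for ch in text)
    return CONSONANT_SEQUENCE.fullmatch(cats) is not None

nkrya = "\u0928\u094D\u0915\u094D\u0930\u094D\u092F"  # CHCHCHC
print(is_consonant_sequence(nkrya))
```

Replacing `{0,3}` with `*` would correspond to the WLE rules of Section 7, which place no limit on the number of consonants joined by Halant.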
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters
and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
6 Variants

There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:

Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.

Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta

The Santali language has a unique requirement for Nukta character ़ (U+093C) positioning, which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.

This gives rise to a possibility of creation of certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants

All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new case of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) where a Nukta will need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.

Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:

आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
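The effect of the "not followed by a Nukta" condition can be sketched as follows. The function names are hypothetical and the code models only the forward (Variant 1 to Variant 2) direction described above; it is not how an LGR tool evaluates variants.

```python
NUKTA = "\u093C"
# Variant 1 characters from Table 16.
VARIANT1 = {"\u0906", "\u0913", "\u093E", "\u094B"}

def eligible_positions(label: str) -> list[int]:
    """Indices where the Variant 1 -> Variant 2 mapping may apply,
    i.e. a Variant 1 character that is not followed by a Nukta."""
    return [i for i, ch in enumerate(label)
            if ch in VARIANT1 and label[i + 1:i + 2] != NUKTA]

def insert_nukta(label: str, index: int) -> str:
    """Apply the mapping at one position by adding a Nukta after it."""
    return label[:index + 1] + NUKTA + label[index + 1:]

original = "\u0906"
variant = insert_nukta(original, eligible_positions(original)[0])
print(eligible_positions(original), eligible_positions(variant))
```

Once the Nukta has been added, the same position is no longer eligible, so the mapping cannot regenerate and produce a double Nukta.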
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric
definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an “effective null variant”, where the U+093C in one sequence maps to “nothing” in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
6.1.2 Overlapped variant analysis involving Nukta

Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.

Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:

Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N

When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the left-hand-side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:

- 093E 093C 0901 is added to Set A,
- 0906 093C 0901 is added to Set B,
- 0906 093C 0902 is added to Set C, and
- 093E 093C 0902 is added to Set D.

For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902, 0903. (Both can be at
the end of the label.) Therefore, the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.

Likewise, for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive a context rule when(followed-by-V-C-or-end).
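The "intersection between these contexts" can be made concrete with plain sets. The `FOLLOWERS` table below simply restates what the two paragraphs above say may follow each code point ("V" for Vowels, "C" for Consonants, "end" for end of label); the table and the function name are illustrative assumptions, not data taken from the XML file.

```python
# What may follow each code point, as stated in the text above.
FOLLOWERS = {
    0x0902: {"V", "C", "end"},
    0x0974: {"V", "C", "end", 0x0901, 0x0902, 0x0903},
    0x093B: {"V", "C", "end", 0x0901, 0x0902, 0x0903},
}

def shared_context(a: int, b: int) -> set:
    """Contexts in which both ends of a variant mapping are valid."""
    return FOLLOWERS[a] & FOLLOWERS[b]

print(sorted(shared_context(0x0902, 0x0974)))  # Set C
print(sorted(shared_context(0x0902, 0x093B)))  # Set D
```

In both cases the shared context is exactly what the rule when(followed-by-V-C-or-end) expresses.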
The resulting variant sets and variant contextual rules are:

Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
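The extension of Sets A to D can be reproduced mechanically. The sketch below is an illustration under stated assumptions: `VARIANT_SETS` holds the four original sets from this section as tuples of code points, and the hypothetical function `with_nukta_forms` adds, for every member containing 0906 or 093E not already followed by a Nukta, the form with 093C inserted.

```python
NUKTA = 0x093C
NUKTA_BASES = {0x0906, 0x093E}  # the two bases discussed in this section

# Original variant sets A to D, as tuples of code points.
VARIANT_SETS = {
    "A": {(0x093E, 0x0901), (0x0949, 0x0902)},
    "B": {(0x0906, 0x0901), (0x0911, 0x0902)},
    "C": {(0x0906, 0x0902), (0x0974,)},
    "D": {(0x093B,), (0x093E, 0x0902)},
}

def with_nukta_forms(members: set) -> set:
    """Return the set extended with the Nukta form of each eligible member."""
    extended = set(members)
    for seq in members:
        for i, cp in enumerate(seq):
            followed_by_nukta = seq[i + 1:i + 2] == (NUKTA,)
            if cp in NUKTA_BASES and not followed_by_nukta:
                extended.add(seq[:i + 1] + (NUKTA,) + seq[i + 1:])
    return extended

for name, members in sorted(VARIANT_SETS.items()):
    added = with_nukta_forms(members) - members
    print(name, [" ".join(f"{cp:04X}" for cp in seq) for seq in added])
```

Each set gains exactly one member, matching the four additions listed in this section.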
Additional Notes

1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule “when(follows-only-C-or-N)” (see Section 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.

2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri

Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.

Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
ऺ U+093A | ं U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
ॆ U+0946 | े U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17 Proposed Variants - Set 2

No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point contexts, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case for making them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1 | Variant 2
U+0901 | U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the character preceding (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
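The context condition of Section 6.4.1 can be illustrated with a short sketch. This is not taken from the accompanying XML file; the function names are hypothetical, and the consonant test assumes that the range U+0915..U+0939 is an adequate stand-in for category C, which is a simplification of the actual repertoire.

```python
CANDRABINDU = "\u0901"
ANUSVARA = "\u0902"
NUKTA = "\u093C"
CANDRA_E = "\u0945"


def is_consonant(ch: str) -> bool:
    # Assumption: U+0915..U+0939 stands in for category C of the repertoire.
    return "\u0915" <= ch <= "\u0939"


def follows_c_or_cn(label: str, i: int) -> bool:
    """True if label[i] is preceded by a Consonant, or by a Consonant plus Nukta."""
    if i >= 1 and is_consonant(label[i - 1]):
        return True
    return i >= 2 and label[i - 1] == NUKTA and is_consonant(label[i - 2])


def candrabindu_variant_positions(label: str) -> list:
    """Indexes of U+0901 for which the mapping to U+0945 U+0902 is defined."""
    return [i for i, ch in enumerate(label)
            if ch == CANDRABINDU and follows_c_or_cn(label, i)]


def expand_candrabindu(label: str) -> str:
    """Replace every eligible U+0901 with U+0945 U+0902; leave the rest alone."""
    eligible = set(candrabindu_variant_positions(label))
    return "".join(CANDRA_E + ANUSVARA if i in eligible else ch
                   for i, ch in enumerate(label))
```

With these definitions, a Candrabindu after क (U+0915) is eligible for the mapping, while a Candrabindu after the vowel आ (U+0906) is not, because a Matra could not validly stand in that position.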
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing any of these variants is chosen first, the other equivalent variant labels should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + …" cases. However, for Devanagari all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are the variants that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
U+0902 | U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
U+093A | U+0A02
U+093C | U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
U+0945 | U+0A71
U+0946 | U+0A47
U+0946 | U+0A4B
U+0947 | U+0A47
U+0947 | U+0A4B
U+0948 | U+0A48
U+0956 | U+0A41
U+0957 | U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Devanagari Script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (where CN is a C followed by an N).
3. M must be preceded by C or CN (where CN is a C followed by an N).
4. X must be preceded by either of V, C, N or M.
5. B must be preceded by either of V, C, N or M.
6. D must be preceded by either of V, C, N or M.
7. V can NOT be preceded by H (details in "Case of V preceded by H").
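The seven rules above can be checked mechanically. The following sketch is an illustration only, not the normative rule set in the accompanying XML file: the code point ranges used for categories C, V and M are simplifying assumptions rather than the Table 6 repertoire, code points outside those ranges are silently ignored, and the names `category` and `wle_violations` are hypothetical.

```python
# Code points that may precede a Nukta (rule 1): sets C1, V1 and M1.
N_BASES = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C, 0x0921, 0x0922,
           0x092B,            # C1
           0x0906, 0x0913,    # V1
           0x093E, 0x094B}    # M1

SINGLETONS = {0x0901: "D", 0x0902: "B", 0x0903: "X", 0x093C: "N", 0x094D: "H"}


def category(ch):
    """Return the WLE symbol for a code point (ranges are assumptions)."""
    cp = ord(ch)
    if cp in SINGLETONS:
        return SINGLETONS[cp]
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x0904 <= cp <= 0x0914 or cp == 0x0960 or 0x0972 <= cp <= 0x0977:
        return "V"
    if cp in (0x093A, 0x093B, 0x094F, 0x0956, 0x0957) or 0x093E <= cp <= 0x094C:
        return "M"
    return None


def wle_violations(label):
    """Return (position, rule_number) pairs; an empty list means no violation."""
    found = []
    for i, ch in enumerate(label):
        cat = category(ch)
        prev = label[i - 1] if i > 0 else None
        prev_cat = category(prev) if prev is not None else None
        # "C or CN": a consonant, or a Nukta that itself follows a consonant.
        after_c_or_cn = prev_cat == "C" or (
            prev_cat == "N" and i >= 2 and category(label[i - 2]) == "C")
        if cat == "N" and (prev is None or ord(prev) not in N_BASES):
            found.append((i, 1))
        elif cat == "H" and not after_c_or_cn:
            found.append((i, 2))
        elif cat == "M" and not after_c_or_cn:
            found.append((i, 3))
        elif cat in ("X", "B", "D") and prev_cat not in ("V", "C", "N", "M"):
            found.append((i, {"X": 4, "B": 5, "D": 6}[cat]))
        elif cat == "V" and prev_cat == "H":
            found.append((i, 7))
    return found
```

For example, the label U+092B U+093C U+093F U+0930 U+094B U+091C U+093C passes all seven checks, whereas U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 is reported under rule 7, because the vowel at position 3 follows a Halant.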
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any of the valid akshar-ending characters. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार aːməchaːr "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1. Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition, in 2 Volumes. Hutchison, London, 1968.
10.2 DEVANĀGARĪ
1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10 (2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1. IS 10401: 8-bit code for information interchange, 1982
2. IS 10315: 7-bit coded character set for information interchange, 1985
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001
5. ISO 2375: Procedure for registration of escape sequences, 2003
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001
7. IDN POLICY, http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in the given context, e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
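The all-or-nothing condition in the preceding paragraph can be sketched as follows. The mapping below is an assumption made for illustration: it reuses only the one-to-one, single-code-point Devanagari-Gurmukhi pairs of Table 19 and leaves out the one-to-many pairs (U+0946, U+0947) and the multi-code-point sequences. The function name `gurmukhi_counterpart` is hypothetical.

```python
# Subset of Table 19: one-to-one, single-code-point pairs only (assumption).
DEVA_TO_GURU = {
    "\u0902": "\u0A02", "\u0907": "\u0A19", "\u0909": "\u0A24",
    "\u0917": "\u0A17", "\u0918": "\u0A2C", "\u091F": "\u0A1F",
    "\u0920": "\u0A20", "\u0922": "\u0A2B", "\u092A": "\u0A27",
    "\u092D": "\u0A2E", "\u092E": "\u0A38", "\u0935": "\u0A15",
    "\u0939": "\u0A35", "\u093C": "\u0A3C", "\u093F": "\u0A3F",
    "\u0940": "\u0A40",
}


def gurmukhi_counterpart(label):
    """Return the Gurmukhi look-alike label, or None if any code point has none."""
    mapped = []
    for ch in label:
        if ch not in DEVA_TO_GURU:
            # One unmatched character is enough to make the label distinguishable.
            return None
        mapped.append(DEVA_TO_GURU[ch])
    return "".join(mapped)
```

A label made only of ग and ट therefore has a Gurmukhi counterpart, while adding a single क (U+0915), which has no entry in the mapping, yields no counterpart at all.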
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 | Bengali (footnote 17)
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
U+0945 | U+0981 | Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
Figure 1 Pictorial depiction of evolution of Devanagari
3.2 Languages considered
Devanagari is used by over 120 languages, which makes it one of the most used scripts in the world. Languages using Devanagari as their primary script belong to varying geo-political scenarios, as given below:
- designated as official (scheduled) languages of some countries
- used by communities living in urban areas
- used by communities living in rural yet accessible areas
- used by communities living in far-flung areas which are not easily connected either by roads or by communication mechanisms
Information about official (scheduled) languages of countries is easily available. Information about languages used by communities living in urban areas is also easily obtainable. Some effort was needed to cover the languages which are spoken by communities living in rural yet accessible areas. However, it was quite difficult to cover the rest of the languages, spoken by the communities living in remote tribal areas which are generally not connected by road or by communication means. Defining the scope of language coverage was hence essential to limit the scope of the work to be undertaken for the analysis of the Devanagari LGR.
NBGP decided to employ the "Expanded Graded Intergenerational Disruption Scale" [EGIDS], which is designed to measure the status of the languages of the world in terms of endangerment or development. The EGIDS consists of 13 levels, with each higher number on the scale representing a greater level of disruption to the intergenerational transmission of the language. NBGP decided to accommodate all the languages belonging to EGIDS Scale 1 to 4 for its analysis, which represent languages that, in one form or the other, are still in usage. Following are the descriptions (footnote 3) of those scales:
Scale | Label | Description
1 | National | The language is widely used between nations in trade, knowledge exchange, and international policy.
2 | Provincial | The language is used in education, work, mass media, and government at the national level.
3 | Wider Communication | The language is used in education, work, mass media, and government within major administrative subdivisions of a nation.
4 | Educational | The language is in vigorous use, with standardization and literature being sustained through a widespread system of institutionally supported education.
Languages belonging to Level 5 and higher are not in widespread usage.
Below is the tabular representation of the languages that have been considered for the Devanagari LGR.
3 https://www.ethnologue.com/about/language-status
EGIDS Scale 1   EGIDS Scale 2   EGIDS Scale 3   EGIDS Scale 4
Hindi
Nepali
Konkani
Maithili
Marathi
Sindhi
Bhatri
Halbi
Kinnauri
Kukna
Panchpargania
Sadri
Wagdi
Bhojpuri
Chhattisgarhi
Dogri
Kashmiri
Limbu
Magahi
Sanskrit
Santali
Tamang, Eastern
Avadhi
Newar
Saraiki (footnote 4)
Table 2 Languages considered under Devanagari LGR
Despite being classified under EGIDS Scale 5, the Boro language is also considered under the Devanagari LGR, as it is one of the scheduled languages of India and is widely spoken.
Apart from the above-mentioned languages, Braj, Dhundari, Mundari and Kharia have also been considered for the analysis, as the communities using them were accessible and provided their inputs.
3.2.1 Case of Sanskrit
Sanskrit is generally perceived as an archaic language used only in ancient religious texts. However, it is worth noting that there is a quite vibrant and active user community of Sanskrit in India which practices Sanskrit on a day-to-day basis. Sanskrit is still taught in schools under various State and Central educational boards. There is increasing use of Sanskrit on social media as well. The same is reflected in the EGIDS scale, where Sanskrit is categorized in Scale 4, indicating the status of the language as "Educational".
4 Though listed in EGIDS scale 4, Saraiki is not covered by the NBGP. As per Ethnologue, the Devanagari script is no longer in use by the Saraiki community. Ref: https://www.ethnologue.com/language/skr
3.3 The structure of written Devanagari
Devanagari is an alphasyllabary, and the heart of the writing system is the akshar. It is this unit which is instinctively recognized by users of the script. To understand the notion of akshar, a brief overview of the writing system is provided in this section, and the akshar itself will be treated in depth in Section 5.4. The writing system of Devanagari could be summed up as composed of the following:
3.3.1 The Consonants
Devanagari consonants have an implicit schwa (footnote 5) ə vowel included in them. As per traditional classification, they are categorized according to their phonetic properties (especially in terms of place plus manner of articulation). There are 5 Varga groups (classes) and one non-Varga group. Each Varga, which corresponds to Stops, contains five consonants classified as per their properties. The first four consonants are classified on the basis of voicing and aspiration, and the last is the corresponding nasal.
Varga | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | क U+0915 | ख U+0916 | ग U+0917 | घ U+0918 | ङ U+0919
Palatal | च U+091A | छ U+091B | ज U+091C | झ U+091D | ञ U+091E
Retroflex | ट U+091F | ठ U+0920 | ड U+0921 | ढ U+0922 | ण U+0923
Dental | त U+0924 | थ U+0925 | द U+0926 | ध U+0927 | न U+0928
Bi-labial | प U+092A | फ U+092B | ब U+092C | भ U+092D | म U+092E
Table 3 Varga classification of consonants
Non-Varga
य U+092F
र U+0930
ल U+0932
ळ U+0933
व U+0935
श U+0936
ष U+0937
स U+0938
ह U+0939
Table 4 Non-Varga consonants
5 Although representing the implicit vowel as "a" is more correct orthographically, the schwa ə, although not part of the orthographic system, has been used, since the "a" would be misunderstood and read as अ/आ/ा.
3.3.2 The Implicit Vowel Killer: Halant (footnote 6)
All consonants contain an implicit vowel (schwa). A special sign is needed to denote that this implicit vowel is stripped off. This is known as the Halant (U+094D). The Halant thus joins two consonants and creates conjuncts, which generally consist of 2 to 4 consonants. In rare cases it can join up to 5 consonants. However, the notion of a maximum number of consonants joining to form one akshar is empirical; it is just an observation drawn from the words that have been observed to date. Given the confluence of languages happening in the Internet age, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. Hence, in the LGR work, this limit will not be enforced (footnote 7).
3.3.3 Vowels
Separate symbols exist for all Vowels which are pronounced independently, either at the beginning or after a vowel sound. To indicate a Vowel sound other than the implicit one, a Vowel sign (Matra) is attached to the consonant. Since the consonant has a built-in schwa, there are equivalent Matras for all vowels except अ.
The correlation is shown as follows:
Vowel | Corresponding vowel sign (Matra)
अ U+0905 | (none)
आ U+0906 | ा U+093E
इ U+0907 | ि U+093F
ई U+0908 | ी U+0940
उ U+0909 | U+0941
ऊ U+090A | U+0942
ऋ U+090B | U+0943
ए U+090F | U+0947
ऐ U+0910 | U+0948
ओ U+0913 | ो U+094B
औ U+0914 | ौ U+094C
ॳ U+0973 | U+093A
ॴ U+0974 | ऻ U+093B
ऎ, ऄ U+090E, U+0904 | U+0946
ऒ U+0912 | ॊ U+094A
ऍ, ॲ U+090D, U+0972 | U+0945
ॠ U+0960 | U+0944
ऑ U+0911 | ॉ U+0949
ॵ U+0975 | ॏ U+094F
ॶ U+0976 | U+0956
ॷ U+0977 | U+0957
Table 5 Vowels with corresponding Matras
Marathi uses ॲ (U+0972) instead of ऍ (U+090D).
6 Unicode (cf. Unicode 3.0 and above) prefers the term Virama. In this report both the terms have been used to denote the character that suppresses the inherent vowel.
7 This can be the case when a foreign language word which admits a large number of consonants is transliterated into Devanāgarī.
3.3.4 The Anusvara (U+0902)
The Anusvara represents a homorganic nasal. It replaces a conjunct group of a Nasal Consonant + Halant + Consonant belonging to that particular varga. Before a non-varga consonant the Anusvara represents a nasal sound. Modern Hindi, Marathi and Konkani languages prefer the Anusvara to the corresponding Half-nasal (footnote 8).
सन्त vs संत sənt "saint": U+0938 U+0928 U+094D U+0924 vs U+0938 U+0902 U+0924
चम्पा vs चंपा tʃəmpa "A flower belonging to the genus Plumeria family": U+091A U+092E U+094D U+092A U+093E vs U+091A U+0902 U+092A U+093E
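The relationship between the two spellings can be made concrete with a small sketch. It only illustrates the orthographic substitution described above and is not a variant rule of this LGR. The varga boundaries are taken from Table 3, the function names are hypothetical, and restricting the following consonant to the four non-nasal members of the varga is an assumption of this sketch.

```python
HALANT = "\u094D"
ANUSVARA = "\u0902"

# Each varga is five consecutive code points; the last one is its nasal (Table 3).
VARGA_STARTS = (0x0915, 0x091A, 0x091F, 0x0924, 0x092A)


def varga_of(ch):
    """Return the first code point of the varga containing ch, or None."""
    cp = ord(ch)
    for start in VARGA_STARTS:
        if start <= cp <= start + 4:
            return start
    return None


def is_varga_nasal(ch):
    start = varga_of(ch)
    return start is not None and ord(ch) == start + 4


def nasal_conjunct_to_anusvara(word):
    """Rewrite Nasal + Halant + same-varga stop as Anusvara + stop."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        if (is_varga_nasal(ch) and i + 2 < len(word)
                and word[i + 1] == HALANT
                and varga_of(word[i + 2]) == varga_of(ch)
                and not is_varga_nasal(word[i + 2])):
            out.append(ANUSVARA)
            i += 2  # skip the nasal and the Halant, keep the following consonant
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Applied to the two examples above, the function turns U+0938 U+0928 U+094D U+0924 into U+0938 U+0902 U+0924 and U+091A U+092E U+094D U+092A U+093E into U+091A U+0902 U+092A U+093E, while a nasal followed by a non-varga consonant is left unchanged.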
3.3.5 Nasalization: Candrabindu (ँ - U+0901)
Candrabindu denotes nasalization of the preceding vowel, as in आँख ãkh "eye" (U+0906 U+0901 U+0916). Present-day Hindi users tend to replace the Candrabindu by the Anusvara.
3.3.6 Nukta (़ - U+093C)9
The Nukta sign is placed below a certain number of consonants to represent sounds found only in words borrowed from Perso-Arabic. It is predominantly used in this manner in Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi and Tamang. It can be adjoined to क (U+0915), ख (U+0916), ग (U+0917), ज (U+091C) and फ (U+092B) to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style, e.g.
फ़िरोज़ firoz (U+092B U+093C U+093F U+0930 U+094B U+091C U+093C)
It is also placed under ड (U+0921) and ढ (U+0922) to indicate flapped sounds, e.g.
बढ़ bədh (U+092C U+0922 U+093C)
The web publication DEVANĀGARĪ ALPHABET AND ITS ROMANIZATION [109] by the Central Hindi Directorate, Ministry of HRD, Government of India clearly states such a use of Nukta in Hindi.
In Bodo, the Nukta is adjoined to ड (U+0921) [110]. In Maithili, it is adjoined to क (U+0915), ज (U+091C), ड (U+0921) and ढ (U+0922) [111]. In Sindhi, it is adjoined to ख (U+0916), ग (U+0917), ज (U+091C), फ (U+092B), ड (U+0921) and ढ (U+0922) [104].
In Kashmiri, it can also be adjoined to च (U+091A), छ (U+091B) and ज (U+091C) [108] to indicate the laterally released affricates:
च़ाय cay "tea" (U+091A U+093C U+093E U+092F)
छ़ल chal "wash" (imperative) (U+091B U+093C U+0932)
पॊज़ póz "fact" (U+092A U+094A U+091C U+093C)
Normally, a Nukta is appended to a Consonant. However, the Santali language uses Nukta in a unique way. The Nukta is adjoined to the following vowels and vowel signs:
a. आ (U+0906)
b. ओ (U+0913)
c. ा (U+093E)
d. ो (U+094B)
8 A half-nasal is used in epigraphy to indicate a nasal consonant conjoined to its corresponding “Varga” through a Halant.
9 The possible sets of consonants/vowels have been derived from various sources, viz. prior research carried out by the Centre for Development of Advanced Computing's [C-DAC] Graphics & Intelligence based Script Technologies [GIST] Research Labs (https://cdac.in/index.aspx?id=mlc_gist_about), Omniglot, and inputs provided by various experts on board the NBGP for specific languages. Only Omniglot references have been provided as they are available online.
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)
The Visarga is frequently used in Sanskrit and represents a sound very close to h, for example दुःख duhkh "sorrow, unhappiness" (U+0926 U+0941 U+0903 U+0916).
The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages using Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire as it is barred in the Maximal Starting Repertoire.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Halant) where default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष (ksha), which gets formed by क ka + ् (halant) + ष sha, gets rendered as क्‌ष when formed by क ka + ् (halant) + Zero Width Non-joiner + ष sha. In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.
The Zero Width Joiner (ZWJ) is another invisible character which is used in certain cases (mostly after Halant) in which a particular conjunct combination gets rendered such that the constituting consonant shapes may not be directly visible in the conjunct shape. For example, the conjunct क्ष (ksha), which gets formed by क ka + ् (halant) + ष sha, does not show the half form of ka joining with sha. However, using ZWJ, the constituting consonants' shapes are preserved in the visual depiction, i.e. क्‍ष, formed by क ka + ् (halant) + Zero Width Joiner + ष sha.
Earlier, the ZWJ was recommended by the Unicode Consortium to be used to generate certain special conjuncts like Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of ZWJ is now not encouraged.
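The three spellings of ksha discussed in this section differ only by an invisible format character, which is easiest to see at the code point level. The following is a minimal Python sketch; the helper name code_points and the variable names are illustrative and not part of the LGR.

```python
import unicodedata

KA, HALANT, SSA = "\u0915", "\u094D", "\u0937"
ZWNJ, ZWJ = "\u200C", "\u200D"

def code_points(text):
    """Render a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

conjunct = KA + HALANT + SSA                # default conjunct
explicit_halant = KA + HALANT + ZWNJ + SSA  # conjunct suppressed, Halant shown
half_form = KA + HALANT + ZWJ + SSA         # half form of ka retained

for label in (conjunct, explicit_halant, half_form):
    print(code_points(label))

# Both joiners have general category Cf (format), which is why they are invisible.
print(unicodedata.category(ZWNJ), unicodedata.category(ZWJ))
```

Because the sequences are distinct strings that may render alike or nearly alike, the treatment of these two characters matters for an identifier system.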
4 Overall Development Process and Methodology
Under the Neo-Brahmi Generation Panel, there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP ensured that the fundamental philosophy behind building those LGRs is in sync across all Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code-point repertoire, across the board, for all the scripts within its ambit.
4.1.1 Inclusion principles
4.1.1.1 Modern usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only or for archival purposes will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous use
Every character proposed should have an unambiguous understanding among the linguistic community about its usage in the language.
4.1.2 Exclusion principles
The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope
The code point repertoire for the root zone being a very special case up the ladder in the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Standard
Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.
For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ़ (U+093C), can be used instead.
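The relationship between U+0958 and its decomposed form can be checked with Unicode normalization. The sketch below uses Python's standard unicodedata module; it assumes that labels are expected in Normalization Form C, as IDNA2008 requires, and the variable names are illustrative.

```python
import unicodedata

QA = "\u0958"  # DEVANAGARI LETTER QA (precomposed)

# U+0958 is excluded from composition, so NFC yields the Ka + Nukta sequence
# rather than keeping the single precomposed code point.
normalized = unicodedata.normalize("NFC", QA)
sequence = [f"U+{ord(ch):04X}" for ch in normalized]
print(sequence)
```

A label typed with the precomposed letter therefore ends up as the two-code-point sequence once normalized, which is the form the repertoire admits.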
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These characters are required in certain cases where an atypical visual shape of an akshar is desired.
In domain names, due to the absence of space “ ” or tab “-”, there will be cases where the inability to use ZWNJ can pose some issues: two words need to be joined together, where the previous word ends in an Explicit Halant and the next word begins with a consonant. In that case, a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if two words देश् (deš, nation) and विदेश (videš, foreign land) are juxtaposed to each other, the resultant word, i.e. “देश्विदेश”10, is not the appropriate way of rendering it. Appropriate rendering of the same would be “देश्‌विदेश”, which can be achieved by adding a ZWNJ in between the two words.
As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire if necessary.
However, there may not be much of an impact from the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes11 may not be possible to make; however, there will not be any impact on the phonetic level.
iii. Maximal Starting Repertoire
The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
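The three successive filters can be pictured as a short chain of membership tests. The Python sketch below is illustrative only: the two exclusion sets contain just the code points named as examples in this section, not the real IDNA or MSR tables, and the function name is hypothetical.

```python
# Devanagari block U+0900..U+097F.
DEVANAGARI_BLOCK = range(0x0900, 0x0980)

# Assumption: only the examples cited in the text, not the full tables.
IDNA_DISALLOWED = {0x0958}  # Devanagari Letter Qa
MSR_EXCLUDED = {0x093D}     # Devanagari Sign Avagraha

def first_failing_filter(cp):
    """Return the name of the first filter that rejects cp, or None if it passes all."""
    if cp not in DEVANAGARI_BLOCK:
        return "script block"
    if cp in IDNA_DISALLOWED:
        return "IDNA"
    if cp in MSR_EXCLUDED:
        return "MSR"
    return None

for cp in (0x0915, 0x0958, 0x093D, 0x0041):
    print(f"U+{cp:04X}", first_failing_filter(cp))
```

The order of the tests mirrors the narrowing described above: a code point rejected by an earlier layer is never considered by a later one.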
10 In this particular case, though it is possible to get the required display by dropping the explicit Halant at the end of the word, one can argue that the pronunciation of the two words, i.e. देश् and देश, is different and hence it changes the fundamental word.
11 Case of क्ष and क्‍ष: the first is composed with क + ् + ष while the latter is composed with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.
4.1.2.3 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970), etc. will not be included.
4.1.2.4 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, especially like DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ॄ (U+0944) and ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.
4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through Public Comment process
The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were of an editorial nature. Some comments demanded attention to the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided, for each of the individual comments received, whether it needed a change in the overall LGR recommendation.
Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2 Devanagari Code Page from [MSR]
Color convention12:
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA 2008 but excluded from the MSR for various reasons - Pinkish background
Not PVALID in IDNA 2008 - White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references of Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (Not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 | 0901 | ँ | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1 Hindi, Nepali | [0][101][102][103][105][108][110][111][112][113]
2 | 0902 | ं | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
3 | 0903 | ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 Hindi | [0][101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][101][102][108][112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 Nepali | [0][102][103][110][112][113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
53 | 093A | ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
55 | 093C | ़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 Hindi | [0][101][105][108][110][109][111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
59 | 0941 | ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
60 | 0942 | ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
61 | 0943 | ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E = candra | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0][100][101][108]
63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant/Virama | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9][100][102][108][112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the “Eyelash Reph”13 construct.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (Not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
13 Unicode uses the term “Eyelash Ra” instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by Normal Ra U+0930), the term “Reph” is used here.
Table 7 Sequences
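Conditional inclusion means that U+0931 is acceptable only as the first element of one of the two sequences in Table 7, never on its own. A minimal Python sketch of that check follows; the function name rra_usage_ok is hypothetical, and the check covers only this one constraint, not the full LGR.

```python
RRA, VIRAMA = "\u0931", "\u094D"
# Third code point of the two permitted sequences: YA and HA.
ALLOWED_AFTER_RRA_VIRAMA = {"\u092F", "\u0939"}

def rra_usage_ok(label):
    """True if every U+0931 in label starts one of the Table 7 sequences."""
    for i, ch in enumerate(label):
        if ch != RRA:
            continue
        follows_virama = label[i + 1:i + 2] == VIRAMA
        third = label[i + 2:i + 3]  # empty string at end of label
        if not (follows_virama and third in ALLOWED_AFTER_RRA_VIRAMA):
            return False
    return True

print(rra_usage_ok("\u0931\u094D\u092F"))  # RRA + VIRAMA + YA
print(rra_usage_ok("\u0931\u094D\u0915"))  # RRA + VIRAMA + KA
```

Listing the sequences in the repertoire, instead of the bare code point, is what keeps standalone RRA out of labels.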
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. In the next section, there are detailed akshar formation rules as applicable to representation of the Hindi language when written in the Devanagari Script. These rules need slight additions for different languages written in Devanagari in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, instead of their descriptive names, the variable names are used. The second section lists four operators along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of the akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991) popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen -
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol | Function
| | Alternative
[] | Optional
* | Variable Repetition
() | Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore V[B|D|X].
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ a | U+0905
Vowel + Anusvara | V[B] | अं aṁ | अ ं U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ aṃ | अ ँ U+0905 U+0901
Vowel + Visarga | V[X] | अः aḥ | अ ः U+0905 U+0903
Table 9
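The rule V[B|D|X] maps directly onto a regular expression. The Python sketch below is a hedged illustration: the vowel class is an assumed subset of independent vowels taken from Table 6, not a normative Hindi list, and the names VOWEL_SEQUENCE and is_vowel_sequence are hypothetical.

```python
import re

# Assumption: independent vowels A..VOCALIC R, CANDRA E, E, AI, CANDRA O, O, AU.
V = "[\u0905-\u090B\u090D\u090F-\u0911\u0913\u0914]"
B, D, X = "\u0902", "\u0901", "\u0903"  # Anusvara, Candrabindu, Visarga

# One vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(f"{V}[{B}{D}{X}]?")

def is_vowel_sequence(text):
    """True if text is a single vowel sequence of the form V[B|D|X]."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None

print(is_vowel_sequence("\u0905\u0902"))        # vowel + Anusvara
print(is_vowel_sequence("\u0905\u0902\u0903"))  # Visarga after Anusvara
```

The optional character class admits at most one modifier, which is how the restriction to a single B, D or X is expressed.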
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the Consonant sequence after the N, M and H. Each of these has been discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क ka | U+0915 <single character>
Consonant + Nukta | C[N] | क़ ḳa | क ़ U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि ki | क ि U+0915 U+093F
Consonant + Anusvara | C[B] | कं kaṁ | क ं U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ kaṃ | क ँ U+0915 U+0901
Consonant + Visarga | C[X] | कः kaḥ | क ः U+0915 U+0903
Consonant + Halant | C[H] | क् k (Pure Consonant) | क ् U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं kīṁ | क ी ं U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ kāṃ | क ा ँ U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः kīḥ | क ी ः U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant14:
*3(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य nkrya | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets
3A. The combination may be followed by M, B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की kkī | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं kkaṁ | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ kkaṃ | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः kkaḥ | U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं kkīṁ | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ kkīṃ | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः kkīḥ | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
14 In the case of Sanskrit, it can join up to 5 consonants.
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
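The consonant sequence rules of Section 5.5.4 can be combined into a single pattern. The Python sketch below is one literal reading of rules 1 to 3B, under stated assumptions: the character classes are illustrative subsets drawn from Table 6, not the normative Hindi sets; a consonant with Nukta is treated as one consonant unit; a trailing Halant is accepted only after a single consonant, as in rule 2; and the names AKSHAR and is_consonant_akshar are hypothetical.

```python
import re

# Assumed character classes (illustrative subsets of Table 6).
C = "[\u0915-\u0928\u092A-\u0930\u0932\u0935-\u0939]"  # consonants
M = "[\u093E-\u0943\u0945\u0947-\u0949\u094B\u094C]"   # matras
N, H = "\u093C", "\u094D"                              # Nukta, Halant
BDX = "[\u0902\u0901\u0903]"                           # Anusvara, Candrabindu, Visarga

CONS = f"{C}{N}?"  # consonant, optionally carrying a Nukta

# Either up to three (consonant + Halant) pairs, a final consonant, an optional
# Matra and an optional B/D/X; or a single consonant followed by Halant.
AKSHAR = re.compile(
    f"(?:(?:{CONS}{H}){{0,3}}{CONS}{M}?{BDX}?"
    f"|{CONS}{H})"
)

def is_consonant_akshar(text):
    """True if text is one consonant-initial akshar under the sketch above."""
    return AKSHAR.fullmatch(text) is not None

print(is_consonant_akshar("\u0928\u094D\u0915\u094D\u0930\u094D\u092F"))  # CHCHCHC
print(is_consonant_akshar("\u0915\u0902\u0903"))                          # C + B + X
```

The bound of three repetitions reflects the empirical limit of four joined consonants quoted for Hindi; the WLE rules in Section 7 do not enforce it, so a validator built for the LGR itself would drop that bound.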
6 Variants
There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String similarity assessment panel) entrusted to deal with such cases. Table 21: Visually confusables in Appendix A: Visually confusable characters/sequences lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and are hence being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character ़ (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in the Section 6.1) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to a possibility of creation of certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16: Proposed Variants - Set 1 share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. E.g., in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of the result. By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16: Proposed Variants - Set 1, the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both the forward and the reverse mapping using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
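The "not followed by a Nukta" condition of 6.1.1 can be illustrated with a minimal Python sketch. It is not the normative XML rule from the accompanying LGR file; the function names `nukta_variant_positions` and `add_nukta_at` are hypothetical and exist only for this illustration.

```python
NUKTA = "\u093C"

# Variant 1 code points from Table 16 (the set name is illustrative only).
NUKTA_VARIANT_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variant_positions(label: str) -> list[int]:
    """Return the indices at which a Variant 1 code point may be replaced
    by its Variant 2 form, i.e. where it is NOT already followed by a Nukta."""
    positions = []
    for i, cp in enumerate(label):
        if cp in NUKTA_VARIANT_BASES:
            following = label[i + 1] if i + 1 < len(label) else ""
            if following != NUKTA:
                positions.append(i)
    return positions


def add_nukta_at(label: str, index: int) -> str:
    """Build the variant label by inserting a Nukta after the given index."""
    return label[: index + 1] + NUKTA + label[index + 1 :]
```

Once the substitution has been applied at a position, the context condition no longer holds there, so the substitution cannot be repeated and no Nukta-Nukta sequence is produced.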
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.

Set             Mapping
Variant Set A   093E 0901 <--> 0949 0902
Variant Set B   0906 0901 <--> 0911 0902
Variant Set C   0906 0902 <--> 0974
Variant Set D   093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:

Source   Glyph   Target      Glyph   Type        Variant Context
0906     आ      0906 093C   आ़     ↔ blocked   not followed-by-N
093E     ा      093E 093C   ा़     ↔ blocked   not followed-by-N

When the variant from the overlapping sets is substituted for 0906 or 093E, respectively, in the sequences of Variant Sets A, B, C and D that contain them (the left-hand side in Sets A, B and C; the right-hand side in Set D), the result is a valid sequence that displays with a Nukta. As the presence or absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:

Set             Mapping                                        Variant Contextual Rule
Variant Set A   093E 0901 <--> 093E 093C 0901 <--> 0949 0902   -
Variant Set B   0906 0901 <--> 0906 093C 0901 <--> 0911 0902   -
Variant Set C   0906 0902 <--> 0906 093C 0902 <--> 0974        when(followed-by-V-C-or-end)
Variant Set D   093B <--> 093E 0902 <--> 093E 093C 0902        when(followed-by-V-C-or-end)

Additional Notes:
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapping with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
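The context reasoning used for Sets C and D in 6.1.2 amounts to a set intersection. The following Python sketch restates it under stated assumptions: the table `MAY_FOLLOW` only transcribes what the prose says may follow each code point, and both it and the function name are hypothetical, not part of the LGR.

```python
# What may follow each code point, as described in 6.1.2.
# "V" = any Vowel, "C" = any Consonant, "end" = end of label.
MAY_FOLLOW = {
    "0902": {"V", "C", "end"},
    "0974": {"V", "C", "end", "0901", "0902", "0903"},
    "093B": {"V", "C", "end", "0901", "0902", "0903"},
}


def shared_following_context(cp_a: str, cp_b: str) -> set[str]:
    """Contexts in which both sides of a variant mapping are valid."""
    return MAY_FOLLOW[cp_a] & MAY_FOLLOW[cp_b]
```

For both Set C (0902 versus 0974) and Set D (0902 versus 093B) the intersection is "Vowel, Consonant or end of label", which is what the rule when(followed-by-V-C-or-end) expresses.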
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.

Variant 1          Variant 2
ॳ  U+0973          अं  U+0905 U+0902
ऺ  U+093A          ं  U+0902
ॴ  U+0974          आं  U+0906 U+0902
ऻ  U+093B          ां  U+093E U+0902
ऎ  U+090E          ऐ  U+0910
ॆ  U+0946          े  U+0947
ॵ  U+0975          औ  U+0914
ॏ  U+094F          ौ  U+094C
Table 17: Proposed Variants - Set 2

No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence or absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:

Variant 1               Variant 2
ँ  U+0901               ॅं  U+0945 U+0902
ाँ  U+093E U+0901        ॉं  U+0949 U+0902
अँ  U+0905 U+0901        ॲं  U+0972 U+0902
एँ  U+090F U+0901        ऍं  U+090D U+0902
आँ  U+0906 U+0901        ऑं  U+0911 U+0902
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, like a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (क + U+0901) and कॅं (क + U+0945 U+0902)
Variant Pair 2: काँ (क + U+093E U+0901) and कॉं (क + U+0949 U+0902)
Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, a Vowel, a Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a consonant or by a consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18: Proposed Variants - Set 3, the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or by a Consonant followed by a Nukta.
There is no additional constraint on the character preceding ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
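The rule of 6.4.1 can be illustrated by a short Python sketch. It assumes, for simplicity, that every code point in the range U+0915 to U+0939 counts as a Consonant, which is broader than the actual LGR repertoire; the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"


def is_consonant(cp: str) -> bool:
    # Simplifying assumption: the whole KA..HA block counts as Consonant.
    return "\u0915" <= cp <= "\u0939"


def candrabindu_variant_positions(label: str) -> list[int]:
    """Indices of Candrabindu (U+0901) for which the mapping to
    U+0945 U+0902 is defined: preceded by C, or by C followed by Nukta."""
    positions = []
    for i, cp in enumerate(label):
        if cp != CANDRABINDU or i == 0:
            continue
        prev = label[i - 1]
        after_c = is_consonant(prev)
        after_cn = prev == NUKTA and i >= 2 and is_consonant(label[i - 2])
        if after_c or after_cn:
            positions.append(i)
    return positions
```

A Candrabindu that follows a Vowel or a Matra is left alone by this rule; the sequence-level pairs of Table 18 (for example U+093E U+0901 and U+0949 U+0902) are separate mappings and are not modelled here.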
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be treated as blocked variants.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + …" cases. However, for Devanagari all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are the variants that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari                              Gurmukhi
ं  U+0902                               ਂ  U+0A02
इ  U+0907                               ਙ  U+0A19
उ  U+0909                               ਤ  U+0A24
ग  U+0917                               ਗ  U+0A17
घ  U+0918                               ਬ  U+0A2C
ट  U+091F                               ਟ  U+0A1F
ठ  U+0920                               ਠ  U+0A20
ढ  U+0922                               ਫ  U+0A2B
प  U+092A                               ਧ  U+0A27
भ  U+092D                               ਮ  U+0A2E
म  U+092E                               ਸ  U+0A38
व  U+0935                               ਕ  U+0A15
ह  U+0939                               ਵ  U+0A35
ऺ  U+093A                               ਂ  U+0A02
़  U+093C                               ਼  U+0A3C
ि  U+093F                               ਿ  U+0A3F
ी  U+0940                               ੀ  U+0A40
ॅ  U+0945                               ੱ  U+0A71
ॆ  U+0946                               ੇ  U+0A47
ॆ  U+0946                               ੋ  U+0A4B
े  U+0947                               ੇ  U+0A47
े  U+0947                               ੋ  U+0A4B
ै  U+0948                               ੈ  U+0A48
ॖ  U+0956                               ੁ  U+0A41
ॗ  U+0957                               ੂ  U+0A42
प्टि  U+092A U+094D U+091F U+093F        ਇ  U+0A07
प्टी  U+092A U+094D U+091F U+0940        ਈ  U+0A08
प्टे  U+092A U+094D U+091F U+0947        ਏ  U+0A0F
प्टॆ  U+092A U+094D U+091F U+0946        ਏ  U+0A0F
त्त  U+0924 U+094D U+0924               ਜ  U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants

Devanagari          Bengali
म  U+092E           ম  U+09AE
ि  U+093F           ি  U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22: Devanagari Cross-script confusables in Appendix B: Cross-script Confusables lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6: Code point repertoire:
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA) and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
   The set C1 consists of these consonants:
   a. क (U+0915)
   b. ख (U+0916)
   c. ग (U+0917)
   d. च (U+091A)
   e. छ (U+091B)
   f. ज (U+091C)
   g. ड (U+0921)
   h. ढ (U+0922)
   i. फ (U+092B)
   The set V1 consists of these vowels:
   a. आ (U+0906) (Required in Santali language)
   b. ओ (U+0913) (Required in Santali language)
   The set M1 consists of these matras:
   a. ा (U+093E) (Required in Santali language)
   b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN¹⁵
3. M must be preceded by C or CN¹⁶
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")

¹⁵ where CN is a C followed by an N
¹⁶ where CN is a C followed by an N
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• The variant is not defined if followed by a Nukta (see 6.1.1).
• The variant is not defined if it is not followed by V or C (including RRA) or the end of the label (see 6.1.2).
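The seven WLE rules above can be sketched as a small Python checker. This is only an illustration under stated assumptions: the category ranges in `category` approximate Table 6 rather than reproduce it, the Eyelash Reph sequences and the variant-only rules are not modelled, and all names are hypothetical.

```python
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
SIGNS = {"\u0901": "D", "\u0902": "B", "\u0903": "X", "\u093C": "N", "\u094D": "H"}


def category(cp: str):
    """Approximate WLE category of a code point (assumed ranges)."""
    if cp in SIGNS:
        return SIGNS[cp]
    n = ord(cp)
    if 0x0915 <= n <= 0x0939:
        return "C"
    if 0x0905 <= n <= 0x0914 or 0x0972 <= n <= 0x0975:
        return "V"
    if n in (0x093A, 0x093B, 0x094F, 0x0956, 0x0957) or 0x093E <= n <= 0x094C:
        return "M"
    return None


def wle_violations(label: str) -> list[int]:
    """Return the numbers of the WLE rules (1-7) violated by the label."""
    violated = []
    for i, cp in enumerate(label):
        cat = category(cp)
        prev = label[i - 1] if i > 0 else ""
        prev_cat = category(prev) if prev else None
        prev2_cat = category(label[i - 2]) if i > 1 else None
        after_c_or_cn = prev_cat == "C" or (prev_cat == "N" and prev2_cat == "C")
        if cat == "N" and prev not in (C1 | V1 | M1):
            violated.append(1)
        elif cat == "H" and not after_c_or_cn:
            violated.append(2)
        elif cat == "M" and not after_c_or_cn:
            violated.append(3)
        elif cat in ("X", "B", "D") and prev_cat not in ("V", "C", "N", "M"):
            violated.append({"X": 4, "B": 5, "D": 6}[cat])
        elif cat == "V" and prev_cat == "H":
            violated.append(7)
    return violated
```

With these assumptions, the label U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 discussed under "Case of V preceded by H" is reported as violating rule 7, while the same label without the Halant passes.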
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of permissible sequences in Table 7: Sequences, it is permitted only within those specific sequences.
2. The last characters of both sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.

Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:

Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DN Domains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualize thy soul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkz.com | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and .np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, https://मराभारत | India | Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:

Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K P Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.

10.1 WRITING SYSTEMS
1. Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchison, London, 1968.

10.2 DEVANĀGARĪ
1. Agrawala, V.S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden/New York: E.J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R.S. (1977). Outline of Hindi Grammar, 2nd ed. Delhi: Oxford University Press.
14. McGregor, R.S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R.S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R.S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P.K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A.K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P.G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D.K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M.K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.

10.3 INDIC COMPUTING SPECIFIC
1. IS 10401: 8-bit code for information interchange. 1982.
2. IS 10315: 7-bit coded character set for information interchange. 1985.
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques. 1987.
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters. 2001.
5. ISO 2375: Procedure for registration of escape sequences. 2003.
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13. 1998-2001.
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1       Confusable 2
क  U+0915          क़  U+0915 U+093C
ख  U+0916          ख़  U+0916 U+093C
ग  U+0917          ग़  U+0917 U+093C
च  U+091A          च़  U+091A U+093C
छ  U+091B          छ़  U+091B U+093C
ज  U+091C          ज़  U+091C U+093C
ड  U+0921          ड़  U+0921 U+093C
ढ  U+0922          ढ़  U+0922 U+093C
फ  U+092B          फ़  U+092B U+093C
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
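The "all constituents must have a counterpart" condition can be sketched in Python as follows. The mapping `DEVA_TO_GURU` is only a four-entry excerpt of the single code point rows of Table 19, chosen for illustration; multi-code-point aksharas and rows with more than one target are deliberately not handled, and the function name is hypothetical.

```python
# Illustrative excerpt of Table 19 (Devanagari -> Gurmukhi), single code points only.
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # GA
    "\u091F": "\u0A1F",  # TTA
    "\u0920": "\u0A20",  # TTHA
    "\u092E": "\u0A38",  # Devanagari MA resembles Gurmukhi SA
}


def cross_script_variant(label: str):
    """Return the Gurmukhi look-alike label, or None if any code point
    lacks a counterpart (one distinct character is enough to tell
    the two labels apart)."""
    if not label or any(cp not in DEVA_TO_GURU for cp in label):
        return None
    return "".join(DEVA_TO_GURU[cp] for cp in label)
```

A label made only of mapped code points yields a whole-label counterpart; adding a single unmapped code point such as क (U+0915) makes the function return None.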
Devanagari confusable   Other script confusable   From script
ः  U+0903               ઃ  U+0A83                 Gujarati
ः  U+0903               ః  U+0C03                 Telugu
ः  U+0903               ಃ  U+0C83                 Kannada
ः  U+0903               ഃ  U+0D03                 Malayalam
ः  U+0903               ඃ  U+0D83                 Sinhala
ः  U+0903               ঃ¹⁷  U+0983               Bengali
उ  U+0909               ও  U+0993                 Bengali
घ  U+0918               ঘ  U+0998                 Bengali
ठ  U+0920               ਨ  U+0A28                 Gurmukhi
ठ  U+0920               ਰ  U+0A30                 Gurmukhi
ड  U+0921               ਡ  U+0A21                 Gurmukhi
ड  U+0921               ਤ  U+0A24                 Gurmukhi
ढ  U+0922               ਢ  U+0A22                 Gurmukhi
त  U+0924               ਜ  U+0A1C                 Gurmukhi
य  U+092F               ਧ  U+0A27                 Gurmukhi
ॅ  U+0945               ঁ  U+0981                 Bengali
Table 22: Devanagari Cross-script confusables

¹⁷ The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
NBGP decided to employ the "Expanded Graded Intergenerational Disruption Scale" [EGIDS], which is designed to measure the status of the languages of the world in terms of endangerment or development. The EGIDS consists of 13 levels, with each higher number on the scale representing a greater level of disruption to the intergenerational transmission of the language. NBGP decided to accommodate all the languages belonging to EGIDS Scale 1 to 4 for its analysis, which represents languages that are, in one form or another, still in usage. Following are the descriptions³ of those scales:

Scale  Label                 Description
1      National              The language is used in education, work, mass media and government at the national level.
2      Provincial            The language is used in education, work, mass media and government within major administrative subdivisions of a nation.
3      Wider Communication   The language is used in work and mass media without official status to transcend language differences across a region.
4      Educational           The language is in vigorous use, with standardization and literature being sustained through a widespread system of institutionally supported education.

Languages belonging to Level 5 and higher are not in widespread usage.
Below is the tabular representation of the languages that have been considered for the Devanagari LGR.

³ https://www.ethnologue.com/about/language-status
EGIDS Scale 1: Hindi, Nepali
EGIDS Scale 2: Konkani, Maithili, Marathi, Sindhi
EGIDS Scale 3: Bhatri, Halbi, Kinnauri, Kukna, Panchpargania, Sadri, Wagdi
EGIDS Scale 4: Bhojpuri, Chhattisgarhi, Dogri, Kashmiri, Limbu, Magahi, Sanskrit, Santali, Tamang (Eastern), Avadhi, Newar, Saraiki⁴
Table 2: Languages considered under Devanagari LGR
Despite being classified under EGIDS Scale 5, the Boro language is also considered under the Devanagari LGR, as it is one of the scheduled languages of India and is widely spoken.
Apart from the above-mentioned languages, Braj, Dhundari, Mundari and Kharia have also been considered for the analysis, as the communities using them were accessible and provided their inputs.

3.2.1 Case of Sanskrit
Sanskrit is generally perceived as an archaic language used only in ancient religious texts. However, it is worth noting that there is a quite vibrant and active user community of Sanskrit in India which practices Sanskrit on a day-to-day basis. Sanskrit is still taught in schools under various State and Central educational boards. There is increasing use of Sanskrit on social media as well. The same is reflected in the EGIDS scale, where Sanskrit is categorized in Scale 4, indicating the status of the language as "Educational".

⁴ Though listed in EGIDS Scale 4, Saraiki is not covered by the NBGP. As per Ethnologue, the Devanagari script is no longer in use by the Saraiki community. Ref: https://www.ethnologue.com/language/skr
33 ThestructureofwrittenDevanagari
DevanagariisanalphasyllabaryandtheheartofthewritingsystemistheaksharItisthis
unitwhich is instinctivelyrecognizedbyusersof thescriptTounderstandthenotionofakshar abriefoverviewof thewriting system isprovided in this sectionand theakshar
itselfwill be treated in depth in Section 54 Thewriting system ofDevanagari could besummedupascomposedofthefollowing
3.3.1 The Consonants
Devanagari consonants have an implicit schwa⁵ (ə) vowel included in them. As per traditional classification, they are categorized according to their phonetic properties (especially in terms of place plus manner of articulation). There are 5 Varga groups (classes) and one non-Varga group. Each Varga, which corresponds to Stops, contains five consonants classified as per their properties. The first four consonants are classified on the basis of voicing and aspiration, and the last is the corresponding nasal.
Varga | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | क U+0915 | ख U+0916 | ग U+0917 | घ U+0918 | ङ U+0919
Palatal | च U+091A | छ U+091B | ज U+091C | झ U+091D | ञ U+091E
Retroflex | ट U+091F | ठ U+0920 | ड U+0921 | ढ U+0922 | ण U+0923
Dental | त U+0924 | थ U+0925 | द U+0926 | ध U+0927 | न U+0928
Bi-labial | प U+092A | फ U+092B | ब U+092C | भ U+092D | म U+092E
Table 3 Varga classification of consonants
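Because each Varga row in Table 3 occupies five consecutive code points, the classification can be derived arithmetically. The following is a minimal sketch under that observation; the names `VARGAS`, `COLUMNS` and `classify_varga` are illustrative and not part of the LGR.

```python
# Sketch: derive the Varga row and column of a stop consonant from its code point.
# First code point of each Varga row, as listed in Table 3.
VARGAS = {
    "Velar": 0x0915,
    "Palatal": 0x091A,
    "Retroflex": 0x091F,
    "Dental": 0x0924,
    "Bi-labial": 0x092A,  # U+0929 lies between the Dental and Bi-labial rows
}

# Column order inside a row, matching Table 3.
COLUMNS = ["unvoiced -asp", "unvoiced +asp", "voiced -asp", "voiced +asp", "nasal"]


def classify_varga(ch):
    """Return (varga, column) for a Varga consonant, or None for anything else."""
    cp = ord(ch)
    for name, first in VARGAS.items():
        if first <= cp < first + 5:
            return name, COLUMNS[cp - first]
    return None
```

Non-Varga consonants such as U+092F fall outside every row and yield `None`.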
Non-Varga
य U+092F | र U+0930 | ल U+0932 | ळ U+0933 | व U+0935 | श U+0936 | ष U+0937 | स U+0938 | ह U+0939
Table 4 Non-Varga consonants
5 Although representing the implicit vowel as "a" is more correct orthographically, the schwa ə, although not part of the orthographic system, has been used, since the "a" would be misunderstood and read as अआा
3.3.2 The Implicit Vowel Killer: Halant⁶
All consonants contain an implicit vowel (schwa). A special sign is needed to denote that this implicit vowel is stripped off. This is known as the Halant (U+094D). The Halant thus joins two consonants and creates conjuncts, which can generally be combinations of 2 to 4 consonants. In rare cases it can join up to 5 consonants. However, the notion of a maximum number of consonants joining to form one akshar is empirical. It is just an observation drawn from the words that have been observed to date. Given the confluence of languages happening in the Internet age, the possibility that one may want a generic Top Level Domain [gTLD] which may have more than the observed maximum cannot be ruled out. Hence, in the LGR work, this limit will not be enforced.⁷
3.3.3 Vowels
Separate symbols exist for all Vowels which are pronounced independently, either at the beginning or after a vowel sound. To indicate a Vowel sound other than the implicit one, a Vowel sign (Matra) is attached to the consonant. Since the consonant has a built-in schwa, there are equivalent Matras for all vowels excepting the अ.
The correlation is shown as follows:
Vowel | Corresponding vowel sign (Matra)
अ U+0905 | (none)
आ U+0906 | ा U+093E
इ U+0907 | ि U+093F
ई U+0908 | ी U+0940
उ U+0909 | U+0941
ऊ U+090A | U+0942
ऋ U+090B | U+0943
ए U+090F | U+0947
ऐ U+0910 | U+0948
ओ U+0913 | ो U+094B
औ U+0914 | ौ U+094C
ॳ U+0973 | U+093A
ॴ U+0974 | ऻ U+093B
ऎ U+090E, ऄ U+0904 | U+0946
ऒ U+0912 | ॊ U+094A
ऍ U+090D, ॲ U+0972 | U+0945
ॠ U+0960 | U+0944
ऑ U+0911 | ॉ U+0949
ॵ U+0975 | ॏ U+094F
ॶ U+0976 | U+0956
ॷ U+0977 | U+0957
Table 5 Vowels with corresponding Matras
Marathi uses ॲ (U+0972) instead of ऍ (U+090D).
6 Unicode (cf. Unicode 3.0 and above) prefers the term Virama. In this report both the terms have been used to denote the character that suppresses the inherent vowel.
7 This can be the case when a foreign language word which admits a large number of consonants is transliterated into Devanāgarī.
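The vowel-to-Matra correlation of Table 5 can be held as a simple lookup table. The sketch below transcribes the code point pairs from the table; the names `VOWEL_TO_MATRA`, `matra_for` and `vowels_for` are illustrative assumptions, not identifiers from the LGR XML.

```python
# Sketch: Table 5 as a lookup from independent vowel to its Matra (code points).
VOWEL_TO_MATRA = {
    0x0906: 0x093E, 0x0907: 0x093F, 0x0908: 0x0940, 0x0909: 0x0941,
    0x090A: 0x0942, 0x090B: 0x0943, 0x090F: 0x0947, 0x0910: 0x0948,
    0x0913: 0x094B, 0x0914: 0x094C, 0x0973: 0x093A, 0x0974: 0x093B,
    0x090E: 0x0946, 0x0904: 0x0946, 0x0912: 0x094A, 0x090D: 0x0945,
    0x0972: 0x0945, 0x0960: 0x0944, 0x0911: 0x0949, 0x0975: 0x094F,
    0x0976: 0x0956, 0x0977: 0x0957,
}


def matra_for(vowel):
    """Return the Matra for an independent vowel, or None (e.g. for U+0905)."""
    cp = VOWEL_TO_MATRA.get(ord(vowel))
    return chr(cp) if cp is not None else None


def vowels_for(matra):
    """Return the independent vowels that share the given Matra, sorted."""
    return sorted(chr(v) for v, m in VOWEL_TO_MATRA.items() if m == ord(matra))
```

U+0905 has no entry because the consonant already carries the implicit schwa.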
3.3.4 The Anusvara (U+0902)
The Anusvara represents a homorganic nasal. It replaces a conjunct group of a Nasal Consonant + Halant + Consonant belonging to that particular varga. Before a non-varga consonant, the Anusvara represents a nasal sound. Modern Hindi, Marathi and Konkani languages prefer the Anusvara to the corresponding Half-nasal.⁸
सन्त vs संत /sənt/ "saint": U+0938 U+0928 U+094D U+0924 vs U+0938 U+0902 U+0924
चम्पा vs चंपा /tʃəmpa/ "a flower belonging to the genus Plumeria family": U+091A U+092E U+094D U+092A U+093E vs U+091A U+0902 U+092A U+093E
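The substitution described above (Nasal + Halant replaced by Anusvara before a consonant of the same varga) can be sketched as a small rewrite function. This is only an illustration of the orthographic preference; it is not a rule of the LGR, and the function name `to_anusvara` is an assumption.

```python
import re

ANUSVARA = "\u0902"
HALANT = "\u094D"
# First code point of each five-letter Varga; the nasal is the fifth letter.
VARGA_STARTS = (0x0915, 0x091A, 0x091F, 0x0924, 0x092A)


def to_anusvara(text):
    """Replace Nasal + Halant with Anusvara before a stop of the same Varga."""
    out = text
    for start in VARGA_STARTS:
        nasal = chr(start + 4)
        stops = "".join(chr(start + i) for i in range(4))
        # The lookahead keeps the following stop consonant in place.
        out = re.sub(f"{nasal}{HALANT}(?=[{stops}])", ANUSVARA, out)
    return out
```

A nasal followed by a consonant of a different varga is left unchanged.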
3.3.5 Nasalization: Candrabindu (U+0901)
Candrabindu denotes nasalization of the preceding vowel, as in आँख /ãkh/ "eye" (U+0906 U+0901 U+0916). Present-day Hindi users tend to replace the Candrabindu by the Anusvara.
3.3.6 Nukta (U+093C)⁹
The Nukta sign is placed below a certain number of consonants to represent sounds found only in words borrowed from Perso-Arabic. It is predominantly used in this manner in Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi and Tamang. It can be adjoined to क (U+0915), ख (U+0916), ग (U+0917), ज (U+091C) and फ (U+092B) to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style, e.g.
फ़िरोज़ /firoz/ (U+092B U+093C U+093F U+0930 U+094B U+091C U+093C)
It is also placed under ड (U+0921) and ढ (U+0922) to indicate flapped sounds, e.g.
बढ़ /bədh/ (U+092C U+0922 U+093C)
8 A half-nasal is used in epigraphy to indicate a nasal consonant conjoined to its corresponding "Varga" through a Halant.
9 The possible sets of consonants/vowels have been derived from various sources, viz. prior research carried out by Centre for Development of Advanced Computing's [C-DAC] Graphics Intelligence based Script Technologies [GIST] Research Labs (https://cdac.in/index.aspx?id=mlc_gist_about), Omniglot, and inputs provided by various experts on board the NBGP for specific languages. Only Omniglot references have been provided, as they are available online.
The Web Publication DEVANĀGARĪ ALPHABET AND ITS ROMANIZATION [109] by the Central Hindi Directorate, Ministry of HRD, Government of India clearly states such a use of Nukta in Hindi.
In Bodo, the Nukta is adjoined to ड (U+0921) [110]. In Maithili, it is adjoined to क (U+0915), ज (U+091C), ड (U+0921) and ढ (U+0922) [111]. In Sindhi, it is adjoined to ख (U+0916), ग (U+0917), ज (U+091C), फ (U+092B), ड (U+0921) and ढ (U+0922) [104].
In Kashmiri, it can also be adjoined to च (U+091A), छ (U+091B) and ज (U+091C) [108] to indicate the laterally released affricates:
च़ाय /cay/ "tea" (U+091A U+093C U+093E U+092F)
छ़ल /chal/ "wash" (imperative) (U+091B U+093C U+0932)
पॊज़ /póz/ "fact" (U+092A U+094A U+091C U+093C)
Normally a Nukta is appended to a Consonant. However, the Santali language uses the Nukta in a unique way. The Nukta is adjoined to the following vowels and vowel signs:
a. आ (U+0906)
b. ओ (U+0913)
c. ा (U+093E)
d. ो (U+094B)
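The per-language Nukta bases listed in this section can be collected into a lookup for quick checking. The sketch below transcribes only the bases named explicitly above; the Hindi entry combines the Perso-Arabic set with the two flapped consonants, and the whole table is an illustrative assumption rather than the normative context rule of the LGR.

```python
NUKTA = "\u093C"

# Bases that may carry a Nukta, per language, as enumerated in this section.
NUKTA_BASES = {
    "Hindi": "\u0915\u0916\u0917\u091C\u092B\u0921\u0922",
    "Bodo": "\u0921",
    "Maithili": "\u0915\u091C\u0921\u0922",
    "Sindhi": "\u0916\u0917\u091C\u092B\u0921\u0922",
    "Kashmiri": "\u091A\u091B\u091C",
    "Santali": "\u0906\u0913\u093E\u094B",  # vowels and vowel signs, not consonants
}


def nukta_ok(language, text):
    """True if every Nukta in text directly follows a base listed for language."""
    allowed = NUKTA_BASES.get(language, "")
    for i, ch in enumerate(text):
        if ch == NUKTA and (i == 0 or text[i - 1] not in allowed):
            return False
    return True
```

A label-level rule would merge these sets, since the root zone LGR is defined per script rather than per language.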
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)
The Visarga is frequently used in Sanskrit and represents a sound very close to /h/, for example दुःख /duhkh/ "sorrow, unhappiness" (U+0926 U+0941 U+0903 U+0916).
The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages using Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire, as it is barred in the Maximal Starting Repertoire.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Halant) where default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष /ksha/, which gets formed by क /ka/ + Halant + ष /sha/, is instead rendered with a visible Halant (क् followed by ष) when formed by क /ka/ + Halant + Zero Width Non-joiner + ष /sha/. In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.
The Zero Width Joiner (ZWJ) is another invisible character which is used in certain cases (mostly after Halant) in which a particular conjunct combination gets rendered such that the constituting consonant shapes may not be directly visible in the conjunct shape. For example, the conjunct क्ष /ksha/, which gets formed by क /ka/ + Halant + ष /sha/, does not show the half form of ka joining with sha. However, using ZWJ, the constituting consonants' shapes are preserved in the visual depiction (the half form of क followed by ष), formed by क /ka/ + Halant + Zero Width Joiner + ष /sha/.
Earlier, the ZWJ was recommended by the Unicode Consortium to be used to generate certain special conjuncts like Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of ZWJ is now not encouraged.
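Since ZWNJ and ZWJ are invisible, the three forms of the ka + sha combination discussed above are easiest to tell apart by their code points. The following sketch simply builds the three sequences; the dictionary keys and helper names are illustrative.

```python
KA, HALANT, SSA = "\u0915", "\u094D", "\u0937"
ZWNJ, ZWJ = "\u200C", "\u200D"

# The three encodings discussed in this section.
FORMS = {
    "default conjunct": KA + HALANT + SSA,
    "explicit halant (ZWNJ)": KA + HALANT + ZWNJ + SSA,
    "half form (ZWJ)": KA + HALANT + ZWJ + SSA,
}


def code_points(s):
    """Render a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


def strip_joiners(s):
    """Remove ZWNJ and ZWJ, leaving the underlying letter sequence."""
    return s.replace(ZWNJ, "").replace(ZWJ, "")
```

All three forms reduce to the same letters once the joiners are removed, which is why their exclusion affects only the visual shape.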
4 Overall Development Process and Methodology
Under the Neo-Brahmi Generation Panel, there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP ensured that the fundamental philosophy behind building those LGRs is in sync across all Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code-points in the code-point repertoire across the board for all the scripts within its ambit.
4.1.1 Inclusion principles
4.1.1.1 Modern usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous use
Every character proposed should have unambiguous understanding among the linguistic community about its usage in the language.
4.1.2 Exclusion principles
The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are prerequisites to the Label Generation Rule Sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope
The code point repertoire for the root zone being a very special case up the ladder in the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Standard
Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.
For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + (U+093C), can be used instead.
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These are required in certain cases where an atypical visual shape of an akshar is desired.
In domain names, due to the absence of space " " or tab "-", there will be cases where the inability to use ZWNJ can pose some issues: two words need to be joined together, where the previous word needs to end in an Explicit Halant and the next word begins with a consonant. In that case, a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if two words देश् (deš, "nation") and विदेश (videš, "foreign land") are juxtaposed to each other, the resultant word, i.e. "देश्विदेश"¹⁰, is not the appropriate way of rendering it. The appropriate rendering of the same would be "देश् विदेश", which can be achieved by adding a ZWNJ in between the two words.
As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire, if necessary.
However, there may not be much of an impact from the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes¹¹ may not be able to be made; however, there will not be any impact at the phonetic level.
iii. Maximal Starting Repertoire
The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
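The three successive filters can be pictured as a chain of predicates. The sketch below is a rough illustration only: the real IDNA2008 derived property is considerably more involved, so NFC stability is used here as a stand-in (an assumption) that happens to reproduce the U+0958 example, and `MSR_EXCLUDED` is a hypothetical set holding just the exclusions named in this section.

```python
import unicodedata

# Hypothetical stand-in for MSR exclusions mentioned in this document.
MSR_EXCLUDED = {0x093D, 0x200C, 0x200D}


def in_unicode(cp):
    """Filter i: the code point must be an encoded (named) character."""
    return unicodedata.name(chr(cp), None) is not None


def passes_idna_proxy(cp):
    """Filter ii (assumption): approximate the IDNA restriction by NFC stability."""
    ch = chr(cp)
    return unicodedata.normalize("NFC", ch) == ch


def root_zone_candidate(cp):
    """Apply the three filters in order: Unicode, IDNA (proxy), MSR."""
    return in_unicode(cp) and passes_idna_proxy(cp) and cp not in MSR_EXCLUDED
```

U+0958 fails the second filter because normalization yields U+0915 U+093C, while U+093D passes it and is removed only by the third.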
10 In this particular case, though it is possible to get the required display by dropping the explicit Halant at the end of the word, one can argue that the pronunciation of the two words, i.e. देश् and देश, is different and hence it changes the fundamental word.
11 Case of क्ष and क + Halant + ZWJ + ष: the first is composed with क + Halant + ष, while the latter is with क + Halant + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.
4.1.2.3 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970), etc. will not be included.
4.1.2.4 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, especially like DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms (U+0944) and (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.
4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through the Public Comment process
The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were of an editorial nature. Some comments demanded attention to the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided, for each of the individual comments received, whether it needed a change in the overall LGR recommendation.
Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2: Devanagari Code Page from [MSR]
Color convention:¹²
All characters that are included in the [MSR]: Yellow background
PVALID in IDNA2008 but excluded from the MSR for various reasons: Pinkish background
Not PVALID in IDNA2008: White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references have been given in the last column, titled "Reference". For the entire coverage of Devanagari code points, references of Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 | 0901 | | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1 Hindi, Nepali | [0] [101] [102] [103] [105] [108] [110] [111] [112] [113]
2 | 0902 | | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
3 | 0903 | ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 Hindi | [0] [101] [102] [103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 Hindi | [0] [101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 Kashmiri | [0] [105] [108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0] [100] [101] [102] [108] [112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 Kashmiri | [0] [105] [108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 Nepali | [0] [102] [103] [110] [112] [113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [104] [105] [108] [113]
53 | 093A | | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 Kashmiri | [11] [105] [108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 Kashmiri | [11] [105] [108]
55 | 093C | | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 Hindi | [0] [101] [105] [108] [110] [109] [111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
59 | 0941 | | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
60 | 0942 | | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
61 | 0943 | | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 Hindi | [0] [101] [102] [103]
62 | 0945 | | DEVANAGARI VOWEL SIGN CANDRA E (= candra) | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0] [100] [101] [108]
63 | 0946 | | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0] [105] [108]
64 | 0947 | | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [105] [108] [113]
65 | 0948 | | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0] [100] [108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0] [105] [108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [105] [108] [113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [105] [108] [113]
70 | 094D | | DEVANAGARI SIGN VIRAMA | Halant / Virama | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0] [101] [102] [103] [105] [108] [113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0] [105] [108]
72 | 0956 | | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11] [105] [108]
73 | 0957 | | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11] [105] [108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9] [100] [102] [108] [112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11] [105] [108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11] [105] [108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11] [105] [108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11] [105] [108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11] [105] [108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8] [104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8] [104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8] [104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8] [104]
Table 6 Code point repertoire
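For quick experimentation, the 83 code points of Table 6 can be expressed as a handful of contiguous ranges. The sketch below is a convenience transcription of the table, not the normative XML; the names `RANGES`, `REPERTOIRE` and `outside_repertoire` are illustrative.

```python
# Inclusive code point ranges transcribed from Table 6.
RANGES = [
    (0x0901, 0x0903), (0x0905, 0x090B), (0x090D, 0x0914), (0x0915, 0x0928),
    (0x092A, 0x0930), (0x0932, 0x0933), (0x0935, 0x0939), (0x093A, 0x093C),
    (0x093E, 0x0943), (0x0945, 0x094D), (0x094F, 0x094F), (0x0956, 0x0957),
    (0x0972, 0x0977), (0x097B, 0x097C), (0x097E, 0x097F),
]

REPERTOIRE = {cp for lo, hi in RANGES for cp in range(lo, hi + 1)}


def outside_repertoire(label):
    """List the code points of label that are not single entries of Table 6."""
    return [f"U+{ord(c):04X}" for c in label if ord(c) not in REPERTOIRE]
```

U+0931 is deliberately absent here because it is admitted only inside the sequences of Table 7.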
Apart from the above individual code-points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph"¹³ construct.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code-point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106] [107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106] [107]
Table 7 Sequences
13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by Normal Ra, U+0930), the term "Reph" is used here.
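The conditional inclusion of U+0931 can be checked with a negative lookahead: any RRA that is not immediately followed by Virama and then YA or HA falls outside the two sequences of Table 7. This is a minimal sketch of that idea, with `rra_usage_ok` as an illustrative name; the normative definition is the one in the accompanying XML.

```python
import re

RRA, HALANT, YA, HA = "\u0931", "\u094D", "\u092F", "\u0939"

# Matches an RRA that is NOT the start of RRA + Virama + (YA | HA).
_BARE_RRA = re.compile(f"{RRA}(?!{HALANT}[{YA}{HA}])")


def rra_usage_ok(label):
    """True if every U+0931 in label occurs only inside a Table 7 sequence."""
    return _BARE_RRA.search(label) is None
```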
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. In the next section, there are detailed akshar formation rules as applicable to representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, instead of their descriptive names, the variable names are used. The second section lists four operators along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of the akshar formations, the first of which begins with the vowels and the second one with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol    Function
|         Alternative
[ ]       Optional
*         Variable Repetition
( )       Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ a | U+0905
Vowel + Anusvara | V[B] | अं aṁ | U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ aṃ | U+0905 U+0901
Vowel + Visarga | V[X] | अः aḥ | U+0905 U+0903
Table 9
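The vowel sequence rule translates directly into a regular expression: one vowel, followed by at most one of Anusvara, Candrabindu or Visarga. In the sketch below, the vowel class is taken as the independent vowel letters U+0905 to U+090B and U+090D to U+0914, which is an assumption made for illustration; `is_vowel_sequence` is likewise an illustrative name.

```python
import re

V = "\u0905-\u090B\u090D-\u0914"  # assumption: independent vowel letters
B, D, X = "\u0902", "\u0901", "\u0903"  # Anusvara, Candrabindu, Visarga

# One vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(f"[{V}][{B}{D}{X}]?")


def is_vowel_sequence(s):
    """True if s is a single vowel sequence as defined in this section."""
    return VOWEL_SEQUENCE.fullmatch(s) is not None
```

A second modifier, such as a Visarga after an Anusvara, makes the match fail, which mirrors the restriction to one.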
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the Consonant sequence after the N, M and H. Each of these has been discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क ka | U+0915 (single character)
Consonant + Nukta | C[N] | क़ ḳa | U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि ki | U+0915 U+093F
Consonant + Anusvara | C[B] | कं kaṁ | U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ kaṃ | U+0915 U+0901
Consonant + Visarga | C[X] | कः kaḥ | U+0915 U+0903
Consonant + Halant | C[H] | क् k (pure consonant) | U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं kīṁ | U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ kāṃ | U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः kīḥ | U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant¹⁴:
*3(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य nkrya | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
14 In the case of Sanskrit, it can join up to 5 consonants.
Subsets
3A. The combination may be followed by M, B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की (kkī) | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं (kkaṁ) | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ (kkaṃ) | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः (kkaḥ) | U+0915 U+094D U+0915 U+0903

Table 14
3B. 3(CH)CM may be followed by a B, D, or X

Example

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं (kkīṁ) | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ (kkīṃ) | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः (kkīḥ) | U+0915 U+094D U+0915 U+0940 U+0903

Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation, and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
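The consonant-sequence patterns of this section (C[N], C[M|B|D|X|H], (CM)[D|B|X], 3(CH)C and its subsets 3A and 3B) can be sketched as a single regular expression over category letters. This is only an illustration, not part of the LGR: the names CATEGORY, PATTERN, categories, and is_consonant_sequence are hypothetical, the category table covers only the code points used in Tables 10 to 15, and the limit of four consonants follows case 3 above rather than the WLE rules of Section 7.

```python
import re

# Category letters follow this document: C consonant, N Nukta, M Matra,
# B Anusvara, D Candrabindu, X Visarga, H Halant.
# Assumption: only the code points appearing in Tables 10-15 are mapped.
CATEGORY = {
    0x0915: "C", 0x0928: "C", 0x0930: "C", 0x092F: "C",
    0x093C: "N",
    0x093E: "M", 0x093F: "M", 0x0940: "M",
    0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H",
}

# C or CN, then either a lone Halant (case 2), or up to three further
# Halant-joined consonants (case 3) followed by an optional
# Matra and/or one of B, D, X (cases 2, 2A, 3A, 3B).
PATTERN = re.compile(r"^CN?(?:H|(?:HCN?){0,3}(?:M[BDX]?|[BDX])?)$")


def categories(text):
    """Translate a string into its category letters ('?' if unmapped)."""
    return "".join(CATEGORY.get(ord(ch), "?") for ch in text)


def is_consonant_sequence(text):
    """Return True if text matches one of the patterns of Section 5.5.4."""
    return PATTERN.fullmatch(categories(text)) is not None
```

For instance, under these assumptions the example of Table 13 (U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F) is accepted, while a Matra followed by a second Matra is rejected.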
6 Variants

There are no characters or character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:

Group 1: Confusing due to pure visual similarity

Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String similarity assessment panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.

Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. A brief description of these variants follows, with the variants themselves listed in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta

The Santali language has a unique requirement for Nukta character (U+093C) positioning, which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated that the Whole Label Evaluation rules (given in Section 7) be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.

This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C

Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants

All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). In theory this implies a regenerative tendency: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of the sequence U+0906 U+093C. By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.

Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:

आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)

The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric definition of variants, it is necessary to define both the forward and the reverse variant mappings using the same condition (see also [RFC8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
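The effect of the "not followed by a Nukta" condition can be illustrated with a small sketch. It is an assumption-laden simplification, not the LGR's XML rule: the function name add_nukta_variant is hypothetical, and the sketch substitutes every eligible position at once instead of enumerating all variant combinations as an LGR tool would.

```python
import re

NUKTA = "\u093C"
# Variant 1 code points from Table 16: U+0906, U+0913, U+093E, U+094B.
BASES = "\u0906\u0913\u093E\u094B"

# A base character qualifies only when it is NOT already followed by a Nukta.
FORWARD = re.compile(f"([{BASES}])(?!{NUKTA})")


def add_nukta_variant(label):
    """Map each Variant 1 character to Variant 2 where the context rule allows."""
    return FORWARD.sub(lambda m: m.group(1) + NUKTA, label)
```

Applying the function twice gives the same result as applying it once, which is the point of the context rule: the U+0906 inside U+0906 U+093C is followed by a Nukta, so it is never expanded again into U+0906 U+093C U+093C.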
6.1.2 Overlapped variant analysis involving Nukta

Consider the following variant sets A, B, C, and D. Each of them contains 0906 or 093E.

Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:

Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N

When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the sequences of Variant Sets A, B, C, and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence or absence of a Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:

093E 093C 0901 is added to Set A
0906 093C 0901 is added to Set B
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D

For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.

Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).

The resulting variant sets and variant contextual rules are:

Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)

Additional Notes

1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.

2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri

Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.

Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
ऺ U+093A | ं U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
ॆ U+0946 | े U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C

Table 17 Proposed Variants - Set 2

No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)

Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence or absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.

In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara

This is a case of pairs of similar-looking variants involving a Candrabindu ँ (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara ं (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:

Variant 1 | Variant 2
ँ U+0901 | ॅं U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902

Table 18 Proposed Variants - Set 3

As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character like a Consonant or a Vowel before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):

Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)

Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)

Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair

The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, Vowel, Nukta, or Matra, the same is not true for (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.

Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.

There is no additional constraint on the preceding character of (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
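The context rule above can be sketched with look-behind assertions. This is a hedged illustration only: candra_variant is a hypothetical name, and the consonant class U+0915 to U+0939 is an assumption taken from the Unicode Devanagari block rather than from the exact LGR repertoire.

```python
import re

CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093C"
# Assumption: consonants are approximated by the range KA..HA (U+0915-U+0939).
CONSONANT = "[\u0915-\u0939]"

# Candrabindu preceded by a Consonant, or by a Consonant followed by a Nukta.
# Two separate look-behinds are used because each must have a fixed width.
CONTEXT = re.compile(
    f"(?:(?<={CONSONANT})|(?<={CONSONANT}{NUKTA})){CANDRABINDU}"
)


def candra_variant(label):
    """Replace U+0901 by U+0945 U+0902 only where the context rule holds."""
    return CONTEXT.sub(CANDRA_E_ANUSVARA, label)
```

With these assumptions, a Candrabindu after क (U+0915) is mapped, while a Candrabindu after a Vowel such as अ (U+0905) or after a Matra such as ा (U+093E) is left untouched, because a Matra could not validly appear in that position.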
6.5 Variant Disposition

As the variants mentioned in Table 16, Table 17, and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given a blocked disposition.

There is no preference among these variants. Whichever label containing either of these variants is chosen first, the other equivalent variant label should be blocked.
6.6 Cross-script Variants

A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.

Every individual LGR under NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under NBGP.

NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)

The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are the variants that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.

It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.

NBGP has ensured that the Devanagari, Bengali, and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं U+0902 | ਂ U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
ऺ U+093A | ਂ U+0A02
़ U+093C | ਼ U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
ॅ U+0945 | ੱ U+0A71
ॆ U+0946 | ੇ U+0A47
ॆ U+0946 | ੋ U+0A4B
े U+0947 | ੇ U+0A47
े U+0947 | ੋ U+0A4B
ै U+0948 | ੈ U+0A48
ॖ U+0956 | ੁ U+0A41
ॗ U+0957 | ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C

Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF

Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)

This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.

Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):

C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)

Below are the specific WLE rules:

1. N must be preceded only by a member of C1, V1, or M1

The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)

The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)

The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)

2. H must be preceded by C or CN [15]
3. M must be preceded by C or CN [16]
4. X must be preceded by either of V, C, N, or M
5. B must be preceded by either of V, C, N, or M
6. D must be preceded by either of V, C, N, or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")

[15] where CN is a C followed by an N
[16] where CN is a C followed by an N
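The seven rules above lend themselves to a simple left-to-right check. The following is a minimal sketch, not the normative XML: the names category and violations are hypothetical, and the category ranges are assumptions based on the Unicode Devanagari block (for example, the Kashmiri Matras U+093A and U+093B and the Eyelash Reph sequences are not modelled).

```python
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = set("\u0906\u0913")
M1 = set("\u093E\u094B")

SIGNS = {0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H", 0x093C: "N"}


def category(ch):
    """Assumed category ranges; the real repertoire is given in Table 6."""
    cp = ord(ch)
    if cp in SIGNS:
        return SIGNS[cp]
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x0905 <= cp <= 0x0914:
        return "V"
    if 0x093E <= cp <= 0x094C:
        return "M"
    return "?"


def violations(label):
    """Return the numbers of the WLE rules (1-7) that the label breaks."""
    found = []
    cats = [category(ch) for ch in label]
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else ""
        prev_ch = label[i - 1] if i > 0 else ""
        # "C or CN": previous is C, or previous is N directly after a C.
        after_c_or_cn = prev == "C" or (prev == "N" and i > 1 and cats[i - 2] == "C")
        if cat == "N" and prev_ch not in C1 | V1 | M1:
            found.append(1)
        elif cat == "H" and not after_c_or_cn:
            found.append(2)
        elif cat == "M" and not after_c_or_cn:
            found.append(3)
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            found.append({"X": 4, "B": 5, "D": 6}[cat])
        elif cat == "V" and prev == "H":
            found.append(7)
    return found
```

Under these assumptions, the label U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 is reported as breaking rule 7 only, since its Vowel U+0905 directly follows a Halant.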
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:

• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)

Case of Eyelash Reph

In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:

1. As U+0931 is added as a part of permissible sequences in Table 7 (Sequences), it gets permitted only within those specific sequences.

2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H

As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.

There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).

This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of hyphen, space, or the Zero Width Non-joiner character in the permissible set of characters in the Root zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.

If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors

NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni, and Dr Ajay Data

Following is the full list of NBGP members with their language expertise:

Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi

In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:

Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References

[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019. httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf (Accessed on 18th Feb 2019)

[EGIDS] Expanded Graded Intergenerational Disruption Scale. httpswwwethnologuecomaboutlanguage-status (Accessed on 13th Nov 2017)

[NBGP] Neo-Brahmi Generation Panel

[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016. httpstoolsietforghtmlrfc7940 (Accessed on 1st Dec 2017)

[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", RFC 8228, August 2017. httpstoolsietforghtmlrfc8228 (Accessed on 1st Dec 2017)

[gTLD] generic Top Level Domain

[ISCII] Indian Script Code for Information Interchange. httpscdacinindexaspxid=mlc_gist_iscii (Accessed on 2nd Feb 2018)

[GIST] Graphics Intelligence based Script Technologies. httpscdacinindexaspxid=gist (Accessed on 2nd Feb 2018)

[C-DAC] Centre for Development of Advanced Computing. httpscdacin (Accessed on 2nd Feb 2018)

[0] The Unicode Standard 11. httpwwwunicodeorgversionsUnicode110 (Accessed on 12th Dec 2017)

[8] The Unicode Standard 5.0. httpwwwunicodeorgversionsUnicode500 (Accessed on 12th Dec 2017)

[9] The Unicode Standard 5.1. httpwwwunicodeorgversionsUnicode510 (Accessed on 12th Dec 2017)

[11] The Unicode Standard 6.0. httpwwwunicodeorgversionsUnicode600 (Accessed on 12th Dec 2017)

[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011. httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf (Accessed on 10th Oct 2017)

[101] Omniglot, Hindi. httpswwwomniglotcomwritinghindihtm (Accessed on 10th Oct 2017)

[102] Omniglot, Marathi. httpswwwomniglotcomwritingmarathihtm (Accessed on 10th Oct 2017)

[103] Omniglot, Sanskrit. httpswwwomniglotcomwritingsanskrithtm (Accessed on 10th Oct 2017)

[104] Omniglot, Sindhi. httpswwwomniglotcomwritingsindhihtm (Accessed on 10th Oct 2017)

[105] Omniglot, Kashmiri. httpswwwomniglotcomwritingkashmirihtm (Accessed on 10th Oct 2017)

[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a). httpwwwunicodeorgversionsUnicode1000ch12pdf (Accessed on 13th Nov 2017)

[107] Unicode Indic Group, Devanagari Eyelash Ra. httpunicodeorg~emulleriwgp8utcdochtml (Accessed on 13th Nov 2017)

[108] M. K. Raina, How to read and write Kashmiri in Devanagari. httpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf (Accessed on 12th Dec 2017)

[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization. httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)

[110] Omniglot, Bodo. httpswwwomniglotcomwritingbodohtm (Accessed on 12th Dec 2017)

[111] Omniglot, Maithili. httpswwwomniglotcomwritingmaithilihtm (Accessed on 12th Dec 2017)

[112] Omniglot, Konkani. httpswwwomniglotcomwritingkonkanihtm (Accessed on 20th May 2018)

[113] Omniglot, Nepali. httpswwwomniglotcomwritingnepalihtm (Accessed on 20th May 2018)

[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals. httpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted

Following is a thematically sorted set of documents, books, articles, and webographies consulted in the drafting of this report.

10.1 WRITING SYSTEMS

1. Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchison, London, 1968.

10.2 DEVANĀGARĪ

1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden/New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy 10 (2): 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.

10.3 INDIC COMPUTING SPECIFIC

1. IS 10401: 8-bit code for information interchange, 1982
2. IS 10315: 7-bit coded character set for information interchange, 1985
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001
5. ISO 2375: Procedure for registration of escape sequences, 2003
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001
7. IDN POLICY: httpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters/sequences

Table 21 below shows characters/character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C

Table 21 Visually confusables
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam, and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.

At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.

A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
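The "all constituents must have an equivalent" condition can be sketched as a per-character lookup. The following is illustrative only: cross_script_variant and DEVA_TO_GURU are hypothetical names, and the mapping is a small, single-code-point subset of the Devanagari-Gurmukhi pairs of Table 19, chosen for brevity.

```python
# Assumption: a small subset of the single-code-point pairs from Table 19.
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # ग -> ਗ
    "\u091F": "\u0A1F",  # ट -> ਟ
    "\u0920": "\u0A20",  # ठ -> ਠ
    "\u092E": "\u0A38",  # म -> ਸ
    "\u093F": "\u0A3F",  # ि -> ਿ
    "\u0940": "\u0A40",  # ी -> ੀ
}


def cross_script_variant(label):
    """Return the Gurmukhi look-alike label, or None if any character lacks one."""
    if not label or not all(ch in DEVA_TO_GURU for ch in label):
        return None
    return "".join(DEVA_TO_GURU[ch] for ch in label)
```

A single unmapped character, such as क (U+0915) in this subset, is enough to make the function return None, mirroring the statement that one distinguishing character yields a non-confusable string.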
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 [17] | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali

Table 22 Devanagari Cross-script confusables

[17] The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
EGIDS Scale 1: Hindi, Nepali
EGIDS Scale 2: Konkani, Maithili, Marathi, Sindhi
EGIDS Scale 3: Bhatri, Halbi, Kinnauri, Kukna, Panchpargania, Sadri, Wagdi
EGIDS Scale 4: Bhojpuri, Chhattisgarhi, Dogri, Kashmiri, Limbu, Magahi, Sanskrit, Santali, Tamang (Eastern), Avadhi, Newar, Saraiki (footnote 4)
Table 2 Languages considered under Devanagari LGR
Despite being classified under EGIDS Scale 5, the Boro language is also considered under the Devanagari LGR as it is one of the scheduled languages of India and is widely spoken.
Apart from the above-mentioned languages, Braj, Dhundari, Mundari and Kharia have also been considered for the analysis, as the community using them was accessible and they provided their inputs.
3.2.1 Case of Sanskrit
Sanskrit is generally perceived as an archaic language used only in ancient religious texts. However, it is worth noting that there is quite a vibrant and active user community of Sanskrit in India which practices Sanskrit on a day-to-day basis. Sanskrit is still taught in schools under various State and Central educational boards. There is increasing use of Sanskrit on social media as well. The same is reflected in the EGIDS scale, where Sanskrit is categorized in Scale 4, indicating the status of the language as "Educational".
4 Though listed in EGIDS scale 4, Saraiki is not covered by the NBGP. As per Ethnologue, the Devanagari script is no longer in use by the Saraiki community. Ref: https://www.ethnologue.com/language/skr
3.3 The structure of written Devanagari
Devanagari is an alphasyllabary and the heart of the writing system is the akshar. It is this unit which is instinctively recognized by users of the script. To understand the notion of akshar, a brief overview of the writing system is provided in this section, and the akshar itself will be treated in depth in Section 5.4. The writing system of Devanagari could be summed up as composed of the following:
3.3.1 The Consonants
Devanagari consonants have an implicit schwa /ə/ vowel (footnote 5) included in them. As per traditional classification, they are categorized according to their phonetic properties (especially in terms of place plus manner of articulation). There are 5 Varga groups (classes) and one non-Varga group. Each Varga, which corresponds to Stops, contains five consonants classified as per their properties. The first four consonants are classified on the basis of voicing and aspiration, and the last is the corresponding nasal.
Varga | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal
Velar | क U+0915 | ख U+0916 | ग U+0917 | घ U+0918 | ङ U+0919
Palatal | च U+091A | छ U+091B | ज U+091C | झ U+091D | ञ U+091E
Retroflex | ट U+091F | ठ U+0920 | ड U+0921 | ढ U+0922 | ण U+0923
Dental | त U+0924 | थ U+0925 | द U+0926 | ध U+0927 | न U+0928
Bi-labial | प U+092A | फ U+092B | ब U+092C | भ U+092D | म U+092E
Table 3 Varga classification of consonants
Non-Varga: य U+092F, र U+0930, ल U+0932, ळ U+0933, व U+0935, श U+0936, ष U+0937, स U+0938, ह U+0939
Table 4 Non-Varga consonants
5 Although representing the implicit vowel as "a" is more correct orthographically, the schwa /ə/, although not part of the orthographic system, has been used since the "a" would be misunderstood and read as अ/आ/ा.
3.3.2 The Implicit Vowel Killer: Halant (footnote 6)
All consonants contain an implicit vowel (schwa). A special sign is needed to denote that this implicit vowel is stripped off. This is known as the Halant (् U+094D). The Halant thus joins two consonants and creates conjuncts, which can generally be combinations of 2 to 4 consonants. In rare cases it can join up to 5 consonants. However, the notion of a maximum number of consonants joining to form one akshar is empirical. It is just an observation drawn from the words that have been observed to date. Given the confluence of languages happening in the Internet age, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. Hence, in the LGR work, this limit will not be enforced (footnote 7).
3.3.3 Vowels
Separate symbols exist for all Vowels which are pronounced independently, either at the beginning or after a vowel sound. To indicate a Vowel sound other than the implicit one, a Vowel sign (Matra) is attached to the consonant. Since the consonant has a built-in schwa, there are equivalent Matras for all vowels except अ.
The correlation is shown as follows:
Vowel | Corresponding vowel sign (Matra)
अ U+0905 | -
आ U+0906 | ा U+093E
इ U+0907 | ि U+093F
6 Unicode (cf. Unicode 3.0 and above) prefers the term Virama. In this report both terms have been used to denote the character that suppresses the inherent vowel.
7 This can be the case when a foreign language word which admits a large number of consonants is transliterated into Devanāgarī.
ई U+0908 | ी U+0940
उ U+0909 | ु U+0941
ऊ U+090A | ू U+0942
ऋ U+090B | ृ U+0943
ए U+090F | े U+0947
ऐ U+0910 | ै U+0948
ओ U+0913 | ो U+094B
औ U+0914 | ौ U+094C
ॳ U+0973 | ऺ U+093A
ॴ U+0974 | ऻ U+093B
ऎ U+090E, ऄ U+0904 | ॆ U+0946
ऒ U+0912 | ॊ U+094A
ऍ U+090D, ॲ U+0972 | ॅ U+0945
ॠ U+0960 | ॄ U+0944
ऑ U+0911 | ॉ U+0949
ॵ U+0975 | ॏ U+094F
ॶ U+0976 | ॖ U+0956
ॷ U+0977 | ॗ U+0957
Table 5 Vowels with corresponding Matras
Marathi uses ॲ (U+0972) instead of ऍ (U+090D).
3.3.4 The Anusvara (ं - U+0902)
The Anusvara represents a homorganic nasal. It replaces a conjunct group of a Nasal Consonant + Halant + Consonant belonging to that particular varga. Before a non-varga consonant the Anusvara represents a nasal sound. Modern Hindi, Marathi and Konkani languages prefer the Anusvara to the corresponding Half-nasal (footnote 8).
सन्त vs संत /sənt/ "saint" (U+0938 U+0928 U+094D U+0924 vs U+0938 U+0902 U+0924)
चम्पा vs चंपा /tʃəmpa/ "a flower belonging to the genus Plumeria" (U+091A U+092E U+094D U+092A U+093E vs U+091A U+0902 U+092A U+093E)
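The replacement described above (Nasal Consonant + Halant + Consonant of the same varga becoming Anusvara + Consonant) can be sketched as follows. This is a minimal illustration, not part of the LGR: the function name is invented here, the varga rows are copied from Table 3, and restricting the following consonant to the four stops of the varga (excluding the nasal itself) is an assumption made for the sketch.

```python
HALANT = "\u094D"
ANUSVARA = "\u0902"

# Rows of Table 3: four stops followed by the nasal of the same varga.
VARGAS = [
    "\u0915\u0916\u0917\u0918\u0919",  # velar
    "\u091A\u091B\u091C\u091D\u091E",  # palatal
    "\u091F\u0920\u0921\u0922\u0923",  # retroflex
    "\u0924\u0925\u0926\u0927\u0928",  # dental
    "\u092A\u092B\u092C\u092D\u092E",  # bi-labial
]


def to_anusvara(text: str) -> str:
    """Rewrite nasal + halant + same-varga stop as anusvara + stop."""
    out = []
    i = 0
    while i < len(text):
        replaced = False
        for varga in VARGAS:
            stops, nasal = varga[:4], varga[4]
            if (i + 2 < len(text) and text[i] == nasal
                    and text[i + 1] == HALANT and text[i + 2] in stops):
                out.append(ANUSVARA)
                i += 2  # skip nasal and halant; the stop is copied on the next pass
                replaced = True
                break
        if not replaced:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Applied to the two examples above, the function turns the half-nasal spellings into the Anusvara spellings and leaves other text unchanged.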
3.3.5 Nasalization: Candrabindu (ँ - U+0901)
Candrabindu denotes nasalization of the preceding vowel, as in आँख /ãkh/ "eye" (U+0906 U+0901 U+0916). Present-day Hindi users tend to replace the Candrabindu by the Anusvara.
3.3.6 Nukta (़ - U+093C) (footnote 9)
The Nukta sign is placed below a certain number of consonants to represent sounds found only in words borrowed from Perso-Arabic. It is predominantly used in this manner in Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi and Tamang. It can be adjoined to क (U+0915), ख (U+0916), ग (U+0917), ज (U+091C) and फ (U+092B) to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style, e.g.
फ़िरोज़ /firoz/ (U+092B U+093C U+093F U+0930 U+094B U+091C U+093C)
It is also placed under ड (U+0921) and ढ (U+0922) to indicate flapped sounds, e.g.
बढ़ /bədh/ (U+092C U+0922 U+093C)
8 A half-nasal is used in epigraphy to indicate a nasal consonant conjoined to its corresponding "Varga" through a Halant.
9 The possible sets of consonants/vowels have been derived from various sources, viz. prior research carried out by the Centre for Development of Advanced Computing's [C-DAC] Graphics Intelligence based Script Technologies [GIST] Research Labs (https://cdac.in/index.aspx?id=mlc_gist_about), Omniglot, and inputs provided by various experts on board the NBGP for specific languages. Only Omniglot references have been provided as they are available online.
The web publication DEVANĀGARĪ ALPHABET AND ITS ROMANIZATION [109] by the Central Hindi Directorate, Ministry of HRD, Government of India clearly states such a use of Nukta in Hindi.
In Bodo, the Nukta is adjoined to ड (U+0921) [110]. In Maithili it is adjoined to क (U+0915), ज (U+091C), ड (U+0921) and ढ (U+0922) [111]. In Sindhi it is adjoined to ख (U+0916), ग (U+0917), ज (U+091C), फ (U+092B), ड (U+0921) and ढ (U+0922) [104].
In Kashmiri it can also be adjoined to च (U+091A), छ (U+091B) and ज (U+091C) [108] to indicate the laterally released affricates:
च़ाय /cay/ "tea" (U+091A U+093C U+093E U+092F)
छ़ल /chal/ "wash" (imperative) (U+091B U+093C U+0932)
पॊज़ /póz/ "fact" (U+092A U+094A U+091C U+093C)
Normally a Nukta is appended to a Consonant. However, the Santali language uses Nukta in a unique way. The Nukta is adjoined to the following vowels and vowel signs:
a. आ (U+0906)
b. ओ (U+0913)
c. ा (U+093E)
d. ो (U+094B)
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)
The Visarga is frequently used in Sanskrit and represents a sound very close to /h/, for example दुःख /duhkh/ "sorrow, unhappiness" (U+0926 U+0941 U+0903 U+0916).
The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages using Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire as it is barred in the Maximal Starting Repertoire.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Halant) where default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष /ksha/, which gets formed by क /ka/ + ् (halant) + ष /sha/, gets rendered as क्‌ष when formed by क /ka/ + ् (halant) + Zero Width Non-joiner + ष /sha/. In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.
The Zero Width Joiner (ZWJ) is another invisible character which is used in certain cases (mostly after Halant) in which a particular conjunct combination gets rendered such that the constituent consonant shapes may not be directly visible in the conjunct shape. For example, the conjunct क्ष /ksha/, which gets formed by क /ka/ + ् (halant) + ष /sha/, does not show the half form of ka joining with sha. However, using the ZWJ, the constituent consonants' shapes are preserved in the visual depiction, i.e. क्‍ष, formed by क /ka/ + ् (halant) + Zero Width Joiner + ष /sha/.
Earlier, the ZWJ was recommended by the Unicode Consortium to be used to generate certain special conjuncts like Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of ZWJ is now not encouraged.
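As an illustration of the three code point sequences discussed in this section, the short Python sketch below builds the default conjunct, the ZWNJ form and the ZWJ form of ka + halant + sha and prints their code points. The dictionary keys and the helper name `code_points` are labels chosen for this example only; how each string is rendered depends on the font and shaping engine.

```python
import unicodedata

KA, HALANT, SSA = "\u0915", "\u094D", "\u0937"
ZWNJ, ZWJ = "\u200C", "\u200D"

# The three sequences described above; only the invisible joiner differs.
FORMS = {
    "default conjunct": KA + HALANT + SSA,
    "explicit halant (ZWNJ)": KA + HALANT + ZWNJ + SSA,
    "half form (ZWJ)": KA + HALANT + ZWJ + SSA,
}


def code_points(text: str) -> str:
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    for label, text in FORMS.items():
        print(f"{label}: {code_points(text)}")
    # Official Unicode names of the two invisible characters.
    print(unicodedata.name(ZWNJ), "/", unicodedata.name(ZWJ))
```

Because the joiners are invisible, comparing code points rather than rendered glyphs is the reliable way to tell the three strings apart.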
4 Overall Development Process and Methodology
Under the Neo-Brahmi Generation Panel, there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP ensured that the fundamental philosophy behind building those LGRs is in sync across all Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code point repertoire across the board for all the scripts within its ambit.
4.1.1 Inclusion principles
4.1.1.1 Modern usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only or for archival purposes will not be considered for inclusion in the code point repertoire.
4.1.1.2 Unambiguous use
Every character proposed should have an unambiguous understanding among the linguistic community about its usage in the language.
4.1.2 Exclusion principles
The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope
The code point repertoire for the root zone being a very special case up the ladder in the protocol hierarchies, the canvas of available characters for selection as part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Standard
Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded as far as possible all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.
For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ़ (U+093C), can be used instead.
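The Qa example can be checked with Python's standard `unicodedata` module. This is offered only as an illustrative sketch: it relies on the fact that U+0958 is excluded from recomposition in Unicode normalization, so Normalization Form C (the form IDNA2008 labels are expected to be in) yields the two-code-point sequence; the helper name `to_nfc` is introduced here for the example.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return the NFC-normalized form of the text."""
    return unicodedata.normalize("NFC", text)


QA_PRECOMPOSED = "\u0958"        # DEVANAGARI LETTER QA
KA_PLUS_NUKTA = "\u0915\u093C"   # DEVANAGARI LETTER KA + DEVANAGARI SIGN NUKTA

if __name__ == "__main__":
    # The precomposed letter does not survive NFC; it becomes Ka + Nukta.
    print(to_nfc(QA_PRECOMPOSED) == KA_PLUS_NUKTA)
    # The decomposed sequence is already stable under NFC.
    print(to_nfc(KA_PLUS_NUKTA) == KA_PLUS_NUKTA)
```

A registry-side tool would typically normalize input this way before checking a label against the repertoire, so that users typing the precomposed letter still arrive at the permitted sequence.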
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These are required in certain cases where an atypical visual shape of an akshar is desired.
In domain names, due to the absence of space " " or tab "-", there will be cases where the inability to use ZWNJ can pose some issues: where two words need to be joined together, the previous word needs to end in an Explicit Halant, and the next word begins with a consonant. In that case a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if the two words देश् (deš, "nation") and विदेश (videš, "foreign land") are juxtaposed to each other, the resultant word, i.e. "देश्विदेश" (footnote 10), is not the appropriate way of rendering it. The appropriate rendering would be "देश्‌विदेश", which can be achieved by adding a ZWNJ in between the two words.
As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire if necessary.
However, there may not be much of an impact from the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes (footnote 11) may not be able to be made; however, there will not be any impact on the phonetic level.
iii. Maximal Starting Repertoire
The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
10 In this particular case, though it is possible to get the required display by dropping the explicit Halant at the end of the word, one can argue that the pronunciation of the two words, i.e. देश् and देश, is different and hence it changes the fundamental word.
11 Case of क्ष and क्‍ष: the first is composed with क + ् + ष while the latter is with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.
4.1.2.3 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970) etc. will not be included.
4.1.2.4 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, especially like DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ॄ (U+0944) and ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.
4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through the Public Comment process
The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were editorial in nature. Some comments called for attention to the normative section of the document. The NBGP at large, and the Devanagari team specifically, went through the comments in detail and decided, for each individual comment received, whether it needed a change in the overall LGR recommendation.
Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2 Devanagari Code Page from [MSR]
Color convention (footnote 12):
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the MSR for various reasons - Pinkish background
Not PVALID in IDNA2008 - White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references of Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 | 0901 | ँ | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1: Hindi, Nepali | [0][101][102][103][105][108][110][111][112][113]
2 | 0902 | ं | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][113]
3 | 0903 | ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][113]
4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][113]
5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][113]
6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][113]
7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][113]
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1: Hindi | [0][101][102][103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1: Hindi | [0][101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4: Kashmiri | [0][105][108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1: Hindi | [0][100][101][102][108][112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4: Kashmiri | [0][105][108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1: Nepali | [0][102][103][110][112][113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][104][105][108][113]
53 | 093A | ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4: Kashmiri | [11][105][108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4: Kashmiri | [11][105][108]
55 | 093C | ़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1: Hindi | [0][101][105][108][110][109][111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][113]
59 | 0941 | ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][113]
60 | 0942 | ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][113]
61 | 0943 | ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1: Hindi | [0][101][102][103]
62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E (= candra) | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1: Hindi | [0][100][101][108]
63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4: Kashmiri | [0][105][108]
64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][105][108][113]
65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1: Hindi | [0][100][108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4: Kashmiri | [0][105][108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][105][108][113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][105][108][113]
70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant/Virama | Most of the languages given in section 3.2 | 1: Hindi, Nepali | [0][101][102][103][105][108][113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4: Kashmiri | [0][105][108]
72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4: Kashmiri | [11][105][108]
73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4: Kashmiri | [11][105][108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2: Konkani, Marathi | [9][100][102][108][112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4: Kashmiri | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4: Kashmiri | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4: Kashmiri | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4: Kashmiri | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4: Kashmiri | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2: Sindhi | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2: Sindhi | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2: Sindhi | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2: Sindhi | [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph" (footnote 13) construct.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7 Sequences
13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by the normal Ra, U+0930), the term "Reph" is used here.
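The conditional inclusion of DEVANAGARI LETTER RRA (U+0931) can be pictured as a check that the letter only ever appears at the start of one of the two sequences in Table 7. The sketch below is a simplified illustration of that idea, not the normative rule from the LGR's XML; the function name and the scanning approach are assumptions made for the example.

```python
RRA = "\u0931"

# The two sequences from Table 7: RRA + VIRAMA + YA and RRA + VIRAMA + HA.
ALLOWED_SEQUENCES = ("\u0931\u094D\u092F", "\u0931\u094D\u0939")


def rra_usage_ok(label: str) -> bool:
    """True if every RRA in the label begins one of the allowed sequences."""
    i = 0
    while True:
        i = label.find(RRA, i)
        if i == -1:
            return True
        # str.startswith accepts a tuple of prefixes and a start offset.
        if not label.startswith(ALLOWED_SEQUENCES, i):
            return False
        i += 1
```

A label such as दऱ्या passes because its RRA is followed by VIRAMA and YA, while a bare RRA followed by a vowel sign does not.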
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. In the next section there are detailed akshar formation rules as applicable to representation of the Hindi language when written in the Devanagari Script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, instead of their descriptive names, the variable names are used. The second section lists four operators along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol    Function
|         Alternative
[ ]       Optional
*         Variable Repetition
( )       Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence description   Sequence   Example   Constituting characters
Vowel   V   अ (a)   U+0905
Vowel + Anusvara   V[B]   अं (aṁ)   U+0905 U+0902
Vowel + Candrabindu   V[D]   अँ (aṃ)   U+0905 U+0901
Vowel + Visarga   V[X]   अः (aḥ)   U+0905 U+0903
Table 9
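The vowel sequence V[B|D|X] can be sketched as a regular expression over class symbols. The following is only a minimal illustration: the class table CLASS_OF and the function names are hypothetical and cover a handful of code points, not the full repertoire of this LGR.

```python
import re

# Hypothetical, deliberately small class table (an assumption for
# illustration; it is NOT the code point repertoire of this LGR).
CLASS_OF = {
    "\u0905": "V",  # DEVANAGARI LETTER A
    "\u0906": "V",  # DEVANAGARI LETTER AA
    "\u0902": "B",  # Anusvara
    "\u0901": "D",  # Candrabindu
    "\u0903": "X",  # Visarga
}

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(r"V[BDX]?")


def to_classes(text):
    """Translate code points to class symbols; unknown ones become '?'."""
    return "".join(CLASS_OF.get(ch, "?") for ch in text)


def is_vowel_sequence(text):
    """Return True if the whole text is a single vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(to_classes(text)) is not None
```

Under these assumptions, U+0905 U+0902 is accepted, while U+0905 U+0902 U+0903 is rejected because at most one of B, D or X may follow the vowel.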
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the consonant sequence after the N, M and H. Each of these is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence description   Sequence   Example   Constituting characters
Consonant   C   क (ka)   U+0915 <single character>
Consonant + Nukta   C[N]   क़ (ḳa)   U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence description   Sequence   Example   Constituting characters
Consonant + Matra   C[M]   कि (ki)   U+0915 U+093F
Consonant + Anusvara   C[B]   कं (kaṁ)   U+0915 U+0902
Consonant + Candrabindu   C[D]   कँ (kaṃ)   U+0915 U+0901
Consonant + Visarga   C[X]   कः (kaḥ)   U+0915 U+0903
Consonant + Halant   C[H]   क् (k, pure consonant)   U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence description   Sequence   Example   Constituting characters
Consonant + Matra + Anusvara   CM[B]   कीं (kīṁ)   U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu   CM[D]   काँ (kāṃ)   U+0915 U+093E U+0901
Consonant + Matra + Visarga   CM[X]   कीः (kīḥ)   U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14):
*3(CH)C
Example:
Sequence description   Sequence   Example   Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant   CHCHCHC   न्क्र्य (nkrya)   U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets:
3A. The combination may be followed by M, B, D or X.
Example:
14. In the case of Sanskrit, it can join up to 5 consonants.
Sequence description   Sequence   Example   Constituting characters
Consonant + Halant + Consonant + Matra   CHC[M]   क्की (kkī)   U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara   CHC[B]   क्कं (kkaṁ)   U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu   CHC[D]   क्कँ (kkaṃ)   U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga   CHC[X]   क्कः (kkaḥ)   U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X.
Example:
Sequence description   Sequence   Example   Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara   CHCM[B]   क्कीं (kkīṁ)   U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu   CHCM[D]   क्कीँ (kkīṃ)   U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga   CHCM[X]   क्कीः (kkīḥ)   U+0915 U+094D U+0915 U+0940 U+0903
Table 15
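The consonant sequences of cases 1 to 3B can be folded into one pattern over class symbols. The sketch below is an interpretation rather than part of the proposal: it assumes that the repetition operator allows at most three (CH) groups before the final consonant, that a Nukta may follow any consonant, and that the classes can be approximated by Unicode code point ranges instead of the actual repertoire table. The names class_of and is_consonant_sequence are hypothetical.

```python
import re

SIGNS = {0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H", 0x093C: "N"}


def class_of(ch):
    """Approximate class symbol of one code point (range-based assumption)."""
    cp = ord(ch)
    if 0x0915 <= cp <= 0x0939:
        return "C"  # consonants
    if 0x093E <= cp <= 0x094C:
        return "M"  # dependent vowel signs
    if 0x0905 <= cp <= 0x0914:
        return "V"  # independent vowels
    return SIGNS.get(cp, "?")


# Up to three (C[N]H) groups, a final C[N], then either a Halant or
# an optional Matra followed by at most one of B, D, X.
CONSONANT_SEQUENCE = re.compile(r"(?:CN?H){0,3}CN?(?:H|M?[BDX]?)")


def is_consonant_sequence(text):
    """Return True if the whole text is one Hindi consonant sequence."""
    classes = "".join(class_of(ch) for ch in text)
    return CONSONANT_SEQUENCE.fullmatch(classes) is not None
```

With these assumptions, the four-consonant conjunct of Table 13 is accepted and a five-consonant conjunct is rejected, which mirrors the Hindi-specific limit but not the more permissive WLE rules of Section 7.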
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
6 Variants
There are no characters or character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21: Visually confusables, in Appendix A: Visually confusable characters/sequences, lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to a possibility of creation of certain labels that can be deceptively similar to a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed:
Variant 1   Variant 2
आ U+0906   आ़ U+0906 U+093C
ओ U+0913   ओ़ U+0913 U+093C
ा U+093E   ा़ U+093E U+093C
ो U+094B   ो़ U+094B U+093C
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16: Proposed Variants - Set 1 have a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16: Proposed Variants - Set 1, the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
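The effect of the "not followed by a Nukta" condition can be illustrated with a small sketch. The function names are hypothetical and the code is only one possible reading of the rule, not the implementation used by any LGR tool.

```python
NUKTA = "\u093C"
# Variant 1 code points from Table 16 (U+0906, U+0913, U+093E, U+094B).
VARIANT1 = {"\u0906", "\u0913", "\u093E", "\u094B"}


def forward_sites(label):
    """Indexes where the Variant 1 -> Variant 2 mapping is defined,
    i.e. a Variant 1 code point that is not followed by a Nukta."""
    return [i for i, ch in enumerate(label)
            if ch in VARIANT1 and label[i + 1:i + 2] != NUKTA]


def reverse_sites(label):
    """Indexes where the Variant 2 -> Variant 1 mapping is defined,
    i.e. a base + Nukta sequence that is not followed by another Nukta."""
    return [i for i, ch in enumerate(label)
            if ch in VARIANT1
            and label[i + 1:i + 2] == NUKTA
            and label[i + 2:i + 3] != NUKTA]


def apply_forward(label, i):
    """Insert a Nukta after the code point at index i."""
    return label[:i + 1] + NUKTA + label[i + 1:]


label = "\u0906\u092E"                     # U+0906 U+092E
variant = apply_forward(label, forward_sites(label)[0])
```

After one substitution the forward mapping is no longer defined at that position, so the regenerative case U+0906 U+093C U+093C is never produced.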
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set   Mapping
Variant Set A   093E 0901 <--> 0949 0902
Variant Set B   0906 0901 <--> 0911 0902
Variant Set C   0906 0902 <--> 0974
Variant Set D   093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source   Glyph   Target   Glyph   Type   Variant context
0906   आ   0906 093C   आ़   ↔ blocked   not followed-by-N
093E   ा   093E 093C   ा़   ↔ blocked   not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E respectively into the sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set   Mapping   Variant contextual rule
Variant Set A   093E 0901 <--> 093E 093C 0901 <--> 0949 0902   -
Variant Set B   0906 0901 <--> 0906 093C 0901 <--> 0911 0902   -
Variant Set C   0906 0902 <--> 0906 093C 0902 <--> 0974   when(followed-by-V-C-or-end)
Variant Set D   093B <--> 093E 0902 <--> 093E 093C 0902   when(followed-by-V-C-or-end)
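Symmetry and transitivity of the extended sets amount to every member of a set mapping to every other member. A minimal sketch of that closure follows; the dictionary layout and the name all_mappings are illustrative assumptions, not the structure of the accompanying XML file.

```python
from itertools import permutations

# Extended variant sets from the table above: (members, context rule).
VARIANT_SETS = {
    "A": (["093E 0901", "093E 093C 0901", "0949 0902"], None),
    "B": (["0906 0901", "0906 093C 0901", "0911 0902"], None),
    "C": (["0906 0902", "0906 093C 0902", "0974"],
          "followed-by-V-C-or-end"),
    "D": (["093B", "093E 0902", "093E 093C 0902"],
          "followed-by-V-C-or-end"),
}


def all_mappings(members):
    """Every ordered (source, target) pair of distinct members.

    Listing all ordered pairs makes the mapping relation symmetric and
    transitive by construction.
    """
    return set(permutations(members, 2))


mappings_c = all_mappings(VARIANT_SETS["C"][0])
```

Each three-member set therefore yields six directed mappings, and for Sets C and D each of them would carry the same context rule.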
Additional Notes:
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1   Variant 2
ॳ U+0973   अं U+0905 U+0902
ऺ U+093A   ं U+0902
ॴ U+0974   आं U+0906 U+0902
ऻ U+093B   ां U+093E U+0902
ऎ U+090E   ऐ U+0910
ॆ U+0946   े U+0947
ॵ U+0975   औ U+0914
ॏ U+094F   ौ U+094C
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point contexts, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess if these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which when succeeded by an Anusvara (U+0902) create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1   Variant 2
ँ U+0901   ॅं U+0945 U+0902
ाँ U+093E U+0901   ॉं U+0949 U+0902
अँ U+0905 U+0901   ॲं U+0972 U+0902
एँ U+090F U+0901   ऍं U+090D U+0902
आँ U+0906 U+0901   ऑं U+0911 U+0902
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character like a Consonant or a Vowel before them. For the sake of clarity, both pairs preceded by the Devanagari Letter Ka (क - U+0915) are shown here:
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should each be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence to the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18: Proposed Variants - Set 3, the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
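The context condition on the Candrabindu mapping can be sketched as follows. The consonant test uses the range U+0915..U+0939 as an approximation of class C, which is an assumption for illustration; the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"
CANDRA_E_ANUSVARA = "\u0945\u0902"


def is_consonant(ch):
    # Assumption: U+0915..U+0939 approximates the consonant class C.
    return 0x0915 <= ord(ch) <= 0x0939


def follows_only_c_or_n(label, i):
    """True if position i is preceded by C, or by C followed by a Nukta."""
    if i == 0:
        return False
    prev = label[i - 1]
    if is_consonant(prev):
        return True
    return prev == NUKTA and i >= 2 and is_consonant(label[i - 2])


def to_candra_anusvara(label):
    """Replace every Candrabindu whose context satisfies the rule."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and follows_only_c_or_n(label, i):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu after a vowel or a Matra is left untouched, since U+0945 could not validly appear in that position.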
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + …" cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are the variants that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari   Gurmukhi
ं U+0902   ਂ U+0A02
इ U+0907   ਙ U+0A19
उ U+0909   ਤ U+0A24
ग U+0917   ਗ U+0A17
घ U+0918   ਬ U+0A2C
ट U+091F   ਟ U+0A1F
ठ U+0920   ਠ U+0A20
ढ U+0922   ਫ U+0A2B
प U+092A   ਧ U+0A27
भ U+092D   ਮ U+0A2E
म U+092E   ਸ U+0A38
व U+0935   ਕ U+0A15
ह U+0939   ਵ U+0A35
ऺ U+093A   ਂ U+0A02
़ U+093C   ਼ U+0A3C
ि U+093F   ਿ U+0A3F
ी U+0940   ੀ U+0A40
ॅ U+0945   ੱ U+0A71
ॆ U+0946   ੇ U+0A47
ॆ U+0946   ੋ U+0A4B
े U+0947   ੇ U+0A47
े U+0947   ੋ U+0A4B
ै U+0948   ੈ U+0A48
ॖ U+0956   ੁ U+0A41
ॗ U+0957   ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F   ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940   ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947   ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946   ਏ U+0A0F
त्त U+0924 U+094D U+0924   ਜ U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari   Bengali
म U+092E   ম U+09AE
ि U+093F   ি U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
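Because a cross-script variant concerns an entire label, a lookup can only succeed when every code point of the label has a counterpart in the other script. The sketch below uses a few single-code-point rows of Table 19; it ignores the multi-code-point sequences and the one-to-many rows (such as U+0946), and the names DEVA_TO_GURU and gurmukhi_lookalike are hypothetical.

```python
# A partial, illustrative subset of Table 19 (single code points only).
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # U+0917 -> U+0A17
    "\u091F": "\u0A1F",  # U+091F -> U+0A1F
    "\u0920": "\u0A20",  # U+0920 -> U+0A20
    "\u092E": "\u0A38",  # U+092E -> U+0A38
    "\u093F": "\u0A3F",  # U+093F -> U+0A3F
    "\u0940": "\u0A40",  # U+0940 -> U+0A40
}


def gurmukhi_lookalike(label):
    """Return the Gurmukhi whole-label variant, or None if any code
    point of the Devanagari label has no cross-script counterpart."""
    mapped = []
    for ch in label:
        if ch not in DEVA_TO_GURU:
            return None
        mapped.append(DEVA_TO_GURU[ch])
    return "".join(mapped)
```

A label containing even one unmapped code point, such as U+0915 in this subset, has no whole-label variant under this sketch.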
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22: Devanagari Cross-script confusables, in Appendix B: Cross-script Confusables, lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6: Code point repertoire:
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA) and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (required in the Santali language)
b. ओ (U+0913) (required in the Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (required in the Santali language)
b. ो (U+094B) (required in the Santali language)
2. H must be preceded by C or CN (where CN is a C followed by an N).
3. M must be preceded by C or CN (where CN is a C followed by an N).
4. X must be preceded by either of V, C, N or M.
5. B must be preceded by either of V, C, N or M.
6. D must be preceded by either of V, C, N or M.
7. V can NOT be preceded by H (details in "Case of V preceded by H").
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
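Rules 1 to 7 each constrain a code point by what precedes it, so they can be checked in a single left-to-right pass. The sketch below is an illustration under stated assumptions: classes are approximated by Unicode ranges rather than taken from the repertoire table, the Eyelash Reph sequences are not modelled, and the names cls and wle_violations are hypothetical.

```python
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
ALLOWED_BEFORE_NUKTA = C1 | V1 | M1
SIGNS = {0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H", 0x093C: "N"}
RULE_FOR_SIGN = {"X": 4, "B": 5, "D": 6}


def cls(ch):
    """Approximate class symbol of one code point (range-based assumption)."""
    cp = ord(ch)
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x093E <= cp <= 0x094C:
        return "M"
    if 0x0905 <= cp <= 0x0914:
        return "V"
    return SIGNS.get(cp, "?")


def wle_violations(label):
    """Return the numbers of the WLE rules violated, in label order."""
    violations = []
    for i, ch in enumerate(label):
        kind = cls(ch)
        prev = label[i - 1] if i > 0 else ""
        prev_kind = cls(prev) if prev else ""
        prev2_kind = cls(label[i - 2]) if i > 1 else ""
        # "C or CN": a consonant, or a Nukta that itself follows a consonant.
        after_c_or_cn = prev_kind == "C" or (
            prev_kind == "N" and prev2_kind == "C")
        if kind == "N" and prev not in ALLOWED_BEFORE_NUKTA:
            violations.append(1)
        elif kind == "H" and not after_c_or_cn:
            violations.append(2)
        elif kind == "M" and not after_c_or_cn:
            violations.append(3)
        elif kind in RULE_FOR_SIGN and prev_kind not in ("V", "C", "N", "M"):
            violations.append(RULE_FOR_SIGN[kind])
        elif kind == "V" and prev_kind == "H":
            violations.append(7)
    return violations
```

For instance, U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 is reported as violating rule 7, while the same label without the Halant passes.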
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of permissible sequences in Table 7: Sequences, it gets permitted only within the specific sequences.
2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of a context rule is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position   Name   Organization   Country   Language expertise
Co-Chair   Ajay Data   Data Xgen Technologies   India   Hindi, English
Co-Chair   Mahesh D Kulkarni   C-DAC   India   Marathi, Hindi, English
Co-Chair   Udaya Narayana Singh   Visva-Bharati, Santiniketan, West Bengal   India   Bengali, Maithili, Hindi, English
Member   Abhijit Dutta   Wikimedia   India   Bengali, Hindi
Member   Akshat S Joshi (Editor)   C-DAC   India   Hindi, Marathi, English
Member   Anivar A Aravind   Indic Project   India   Malayalam
Member   Anupam Agrawal   Tata Consultancy Service   India   Hindi, Bengali
Member   Arvind Bhandari   Gujarat University   India   Gujarati
Member   Ashish Modi   Data Xgen Technologies   India   Hindi
Member   Atiur Rahman Khan   C-DAC   India   Bangla
Member   Bal Krishna Bal   Kathmandu University   Nepal   Nepali
Member   Balaram Prasain   Tribhuvan University   Nepal   Nepali
Member   BASANTA KUMAR PANDA   Regional Institute of Education (NCERT)   India   Odia
Member   Bhim Dhoj Shrestha   Consultant   Nepal   Nepali, Newar
Member   Chitrita Chatterjee   Internet and Mobile Association of India (IAMAI)   India   Multiple languages represented by members of IAMAI
Member   DEBAJIT SHARMA   Anundoram Borooah Institute of Language, Art and Culture   India   Assamese
Member   Dev Dass Manandhar   Consultant   Nepal   Nepali, Newar
Member   Dhanalakshmi K T   Northern Trust   India   Kannada
Member   Ganesh Murmu   Ranchi University   India   Santali
Member   Gangadhar Panday   Babul Films Society   India   Telugu
Member   Ghanashyam Nepal   Benares Hindu University & University of North Bengal   India   Nepali
Member   Girish Chandra Mishra   Language Technology Centre, Ravenshaw University   India   Odia
Member   Gurpreet Singh Lehal   Punjabi University, Patiala   India   Panjabi
Member   Harish Chowdhary   NIXI   India   Hindi
Member   Hempal Shrestha   Nepal Entrepreneurs Hub (NEHUB)   Nepal   Nepali, Newar
Member   Jay Paudyal   Consultant   India   Hindi
Member   Jijo Pappachan   DNDomains   India   Malayalam
Member   K C Tikayatray   Odia Bhasa Pratisthan   India   Odia
Member   Kalyan Vasudeo Kale   Formerly affiliated with University of Pune   India   Marathi
Member   Kuldeep Patnaik   Visualizethysoul   India   Odia
Member   Mukesh Saini   Essel Group   India   Hindi
Member   N Deiva Sundaram   NDS Lingsoft Solutions Pvt Ltd   India   Tamil
Member   Neha Gupta   C-DAC   India   Hindi, English
Member   Nirajan Parajuli   NREN   Nepal   Nepali
Member   Nishit Jain   C-DAC   India   Hindi, English
Member   Pawan Chitrakar   Gapsco   Nepal   Nepali
Member   Prabhakar Pandey   C-DAC   India   Hindi
Member   Prasad P K   A-one Publishers   India   Malayalam
Member   Prateek Pathak   ISOC Mumbai   India   Devanagari
Member   Raiomond Doctor   NLP Consultant   India   English, Hindi, Marathi, Gujarati
Member   Rajib Chakraborty   Society for Natural Language Technology Research   India   Bangla (Bengali)
Member   Rajiv Kumar   NIXI   India
Member   S Maniam   International Forum IT for Tamil   Singapore   Tamil
Member   Santhosh Thottingal   Wikimedia foundation   India   Malayalam, Sourashtra, Tamil
Member   Saroja Bhate   University of Pune   India   Sanskrit
Member   Shambhu Kumar Singh   National Translation Mission, Mysore   India   Maithili
Member   Shanmugam R   C-DAC   India   Tamil
Member   Shantaram S Warde Walawalikar   Independent Researcher   India   Konkani
Member   Shashi Pathania   PGD of Dogri, University of Jammu   India   Dogri
Member   Shubham Saran   NIXI   India
Member   Sinnathambi Shanmugarajah   University of Colombo School of Computing   Sri Lanka   Tamil
Member   Sujith Kartha   Digitalkzcom   India   Malayalam
Member   Suraj Adhikari   Mercantile Communications (and np ccTLD)   Nepal   Nepali
Member   Swarna Prabha Chainary   Guwahati University   India   Bodo
Member   U B Pavanaja   http://vishvakannada.com   India   Kannada
Member   Uma Maheshwar G   CALTS, Univ of Hyderabad   India   Telugu
Member   Uttam Shrestha Rana   NPNOG   Nepal   Nepali
Member   Veena Solomon   (freelancer)   India   Malayalam
Member   Vinay Murarka   Consultant, httpsमराभारत   India   Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages/scripts:
Name   Language/Script expertise
Ajit Kumar   Awadhi, Braj Language
Amar Tumyahang   Limbu Language
Amrit Yonjan   Tamang Language
Aprana Kulkarni   Hindi, Marathi
Basil Baa   Sadri Language
Basil Kiro   Kharia Language
Biswa Limbu   Limbu Language
Devdass Manandhar   Newar
Devendra Kumar Devesh   Bhojpuri Language
Dinbandhu Mahto   Panchpargania Language
Dipika Sangma Narzary   Bodo Language
Dr K P Lekhwani   Sindhi
Dr Birendra Kumar Soy   Mundari Language
Dr Dinesh Kumar Shrivastav   Magahi Language
Dr Harvinder Kaur   Gurmukhi Script
Dr Laxmi Prasad Khatiwada   Nepali Language
Harihar Vaishnav   Halbi
Indra Kumar Tamang   Tamang Language
Jagannath Singh   Panchpargania Language
Narendra Kumar Negi   Kinnauri Language
Prateek Harshwal   Wagdi and Dhundhari Language
Rayem Olem Dungdung   Sadri Language
Tej Man Angdembe   Limbu Language
Urmila Harshwal   Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb. 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov. 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec. 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", RFC 8228, August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec. 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb. 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb. 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb. 2018)
[0] The Unicode Standard 11, http://www.unicode.org/versions/Unicode11.0/ (Accessed on 12th Dec. 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec. 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec. 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec. 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct. 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct. 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct. 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct. 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct. 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct. 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct. 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov. 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov. 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec. 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec. 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec. 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec. 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb. 2019)
10 Booksarticlesandwebographiesconsulted
Followingisathematicallysortedsetofdocumentsbooksarticlesandwebographies
consultedinthedraftingofthisreport
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
54
11 Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12 Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13 McGregor, R. S. (1977). Outline of Hindi Grammar, 2nd ed. Delhi: Oxford University Press.
14 McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15 McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16 McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17 Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18 Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19 Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20 Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (pp. 85-107). New Delhi: D. K. Printworld.
21 Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22 Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23 Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.

10.3 INDIC COMPUTING SPECIFIC

1 IS 10401: 8-bit code for information interchange, 1982.
2 IS 10315: 7-bit coded character set for information interchange, 1985.
3 IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987.
4 ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001.
5 ISO 2375: Procedure for registration of escape sequences, 2003.
6 ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001.
7 IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences

Table 21 below shows characters/character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

| Confusable 1 | Confusable 2 |
|---|---|
| क U+0915 | क़ U+0915 U+093C |
| ख U+0916 | ख़ U+0916 U+093C |
| ग U+0917 | ग़ U+0917 U+093C |
| च U+091A | च़ U+091A U+093C |
| छ U+091B | छ़ U+091B U+093C |
| ज U+091C | ज़ U+091C U+093C |
| ड U+0921 | ड़ U+0921 U+093C |
| ढ U+0922 | ढ़ U+0922 U+093C |
| फ U+092B | फ़ U+092B U+093C |

Table 21: Visually confusables
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.

At first they may not look exactly the same; however, in a given context (e.g. in a browser bar as a part of a domain name, or as a single word with no surrounding text from the same script to distinguish it) they can create visual confusion.

A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
| Devanagari confusable | Other script confusable | From script |
|---|---|---|
| ः U+0903 | ઃ U+0A83 | Gujarati |
| ः U+0903 | ః U+0C03 | Telugu |
| ः U+0903 | ಃ U+0C83 | Kannada |
| ः U+0903 | ഃ U+0D03 | Malayalam |
| ः U+0903 | ඃ U+0D83 | Sinhala |
| ः U+0903 | ঃ¹⁷ U+0983 | Bengali |
| उ U+0909 | ও U+0993 | Bengali |
| घ U+0918 | ঘ U+0998 | Bengali |
| ठ U+0920 | ਨ U+0A28 | Gurmukhi |
| ठ U+0920 | ਰ U+0A30 | Gurmukhi |
| ड U+0921 | ਡ U+0A21 | Gurmukhi |
| ड U+0921 | ਤ U+0A24 | Gurmukhi |
| ढ U+0922 | ਢ U+0A22 | Gurmukhi |
| त U+0924 | ਜ U+0A1C | Gurmukhi |
| य U+092F | ਧ U+0A27 | Gurmukhi |
| ॅ U+0945 | ঁ U+0981 | Bengali |

Table 22: Devanagari cross-script confusables

17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
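
The all-or-nothing condition stated before Table 22 can be sketched in a few lines of Python. This is only an illustration, not part of the LGR: it works per code point rather than per akshara, `TABLE_22` holds just a subset of the rows above, and the name `fully_confusable_scripts` is hypothetical.

```python
# Illustrative subset of Table 22:
# (Devanagari code point, other-script code point, script name).
TABLE_22 = [
    ("\u0903", "\u0A83", "Gujarati"),
    ("\u0903", "\u0C03", "Telugu"),
    ("\u0903", "\u0983", "Bengali"),
    ("\u0909", "\u0993", "Bengali"),
    ("\u0918", "\u0998", "Bengali"),
    ("\u0920", "\u0A28", "Gurmukhi"),
    ("\u0920", "\u0A30", "Gurmukhi"),
    ("\u0921", "\u0A21", "Gurmukhi"),
    ("\u0921", "\u0A24", "Gurmukhi"),
    ("\u0924", "\u0A1C", "Gurmukhi"),
    ("\u092F", "\u0A27", "Gurmukhi"),
]


def fully_confusable_scripts(label):
    """Return the scripts in which every character of the label has a listed confusable."""
    if not label:
        return set()
    # For each character, collect the scripts that offer a look-alike for it.
    per_char = [
        {script for dev, _, script in TABLE_22 if dev == ch}
        for ch in label
    ]
    # A script qualifies only if it appears for every character.
    return set.intersection(*per_char)


print(fully_confusable_scripts("\u0920\u0921"))  # both letters have Gurmukhi look-alikes
print(fully_confusable_scripts("\u0920\u0915"))  # KA has no entry, so no script qualifies
```

A single character without a listed counterpart empties the result, which mirrors the statement that one distinguishing character makes the string non-confusable.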
3.3 The structure of written Devanagari

Devanagari is an alphasyllabary and the heart of the writing system is the akshar. It is this unit which is instinctively recognized by users of the script. To understand the notion of akshar, a brief overview of the writing system is provided in this section; the akshar itself is treated in depth in Section 5.4. The writing system of Devanagari can be summed up as composed of the following:

3.3.1 The Consonants

Devanagari consonants have an implicit schwa⁵ /ə/ vowel included in them. As per traditional classification they are categorized according to their phonetic properties (especially in terms of place plus manner of articulation). There are 5 Varga groups (classes) and one non-Varga group. Each Varga, which corresponds to Stops, contains five consonants classified as per their properties. The first four consonants are classified on the basis of voicing and aspiration, and the last is the corresponding nasal.
| Varga | Unvoiced -Asp | Unvoiced +Asp | Voiced -Asp | Voiced +Asp | Nasal |
|---|---|---|---|---|---|
| Velar | क U+0915 | ख U+0916 | ग U+0917 | घ U+0918 | ङ U+0919 |
| Palatal | च U+091A | छ U+091B | ज U+091C | झ U+091D | ञ U+091E |
| Retroflex | ट U+091F | ठ U+0920 | ड U+0921 | ढ U+0922 | ण U+0923 |
| Dental | त U+0924 | थ U+0925 | द U+0926 | ध U+0927 | न U+0928 |
| Bi-labial | प U+092A | फ U+092B | ब U+092C | भ U+092D | म U+092E |

Table 3: Varga classification of consonants

Non-Varga: य U+092F, र U+0930, ल U+0932, ळ U+0933, व U+0935, श U+0936, ष U+0937, स U+0938, ह U+0939

Table 4: Non-Varga consonants
5 Although representing the implicit vowel as /a/ is more correct orthographically, the schwa /ə/, although not part of the orthographic system, has been used since the /a/ would be misunderstood and read as अ/आ/ा.
3.3.2 The Implicit Vowel Killer: Halant⁶

All consonants contain an implicit vowel (schwa). A special sign is needed to denote that this implicit vowel is stripped off. This is known as the Halant (U+094D). The Halant thus joins two consonants and creates conjuncts, which generally combine 2 to 4 consonants. In rare cases it can join up to 5 consonants. However, the notion of a maximum number of consonants joining to form one akshar is empirical: it is just an observation drawn from the words that have been observed to date. Given the confluence of languages happening in the Internet age, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. Hence, in the LGR work this limit will not be enforced.⁷
3.3.3 Vowels

Separate symbols exist for all Vowels which are pronounced independently, either at the beginning or after a vowel sound. To indicate a Vowel sound other than the implicit one, a Vowel sign (Matra) is attached to the consonant. Since the consonant has a built-in schwa, there are equivalent Matras for all vowels except अ.

The correlation is shown as follows:
| Vowel | Corresponding vowel sign (Matra) |
|---|---|
| अ U+0905 | (none) |
| आ U+0906 | ा U+093E |
| इ U+0907 | ि U+093F |
| ई U+0908 | ी U+0940 |
| उ U+0909 | ु U+0941 |
| ऊ U+090A | ू U+0942 |
| ऋ U+090B | ृ U+0943 |
| ए U+090F | े U+0947 |
| ऐ U+0910 | ै U+0948 |
| ओ U+0913 | ो U+094B |
| औ U+0914 | ौ U+094C |
| ॳ U+0973 | ऺ U+093A |
| ॴ U+0974 | ऻ U+093B |
| ऎ, ऄ U+090E, U+0904 | ॆ U+0946 |
| ऒ U+0912 | ॊ U+094A |
| ऍ, ॲ U+090D, U+0972 | ॅ U+0945 |
| ॠ U+0960 | ॄ U+0944 |
| ऑ U+0911 | ॉ U+0949 |
| ॵ U+0975 | ॏ U+094F |
| ॶ U+0976 | ॖ U+0956 |
| ॷ U+0977 | ॗ U+0957 |

Table 5: Vowels with corresponding Matras

6 Unicode (cf. Unicode 3.0 and above) prefers the term Virama. In this report both the terms have been used to denote the character that suppresses the inherent vowel.
7 This can be the case when a foreign language word which admits a large number of consonants is transliterated into Devanāgarī.
Marathi uses ॲ (U+0972) instead of ऍ (U+090D).
3.3.4 The Anusvara (ं - U+0902)

The Anusvara represents a homorganic nasal. It replaces a conjunct group of a Nasal Consonant + Halant + Consonant belonging to that particular varga. Before a non-varga consonant the Anusvara represents a nasal sound. Modern Hindi, Marathi and Konkani languages prefer the Anusvara to the corresponding Half-nasal⁸.

सन्त vs संत /sənt/ 'saint': U+0938 U+0928 U+094D U+0924 vs U+0938 U+0902 U+0924
चम्पा vs चंपा /tʃəmpa/ 'a flower belonging to the genus Plumeria family': U+091A U+092E U+094D U+092A U+093E vs U+091A U+0902 U+092A U+093E
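
The replacement of Nasal + Halant + same-varga consonant by the Anusvara can be sketched as follows. This is a hedged illustration only: the function name `to_anusvara` is hypothetical, the varga rows are taken from the classification table in Section 3.3.1, and the sketch assumes the following consonant is one of the four stops of the varga (a nasal followed by itself is left untouched).

```python
HALANT = "\u094D"
ANUSVARA = "\u0902"

# Each varga: its four stops, then the corresponding nasal.
VARGAS = [
    ("\u0915\u0916\u0917\u0918", "\u0919"),  # velar
    ("\u091A\u091B\u091C\u091D", "\u091E"),  # palatal
    ("\u091F\u0920\u0921\u0922", "\u0923"),  # retroflex
    ("\u0924\u0925\u0926\u0927", "\u0928"),  # dental
    ("\u092A\u092B\u092C\u092D", "\u092E"),  # bilabial
]
NASAL_TO_STOPS = {nasal: stops for stops, nasal in VARGAS}


def to_anusvara(word):
    """Rewrite nasal + halant before a stop of the same varga as an Anusvara."""
    out = []
    i = 0
    while i < len(word):
        stops = NASAL_TO_STOPS.get(word[i])
        if (
            stops is not None
            and i + 2 < len(word)
            and word[i + 1] == HALANT
            and word[i + 2] in stops
        ):
            out.append(ANUSVARA)
            i += 2  # skip the nasal and the halant; the stop is kept on the next pass
        else:
            out.append(word[i])
            i += 1
    return "".join(out)


for spelled_out in ("\u0938\u0928\u094D\u0924", "\u091A\u092E\u094D\u092A\u093E"):
    print(spelled_out, "->", to_anusvara(spelled_out))
```

Applied to the two examples above, the sketch turns the conjunct spellings into the Anusvara spellings listed alongside them.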
3.3.5 Nasalization: Candrabindu (ँ - U+0901)

Candrabindu denotes nasalization of the preceding vowel, as in आँख /ãkh/ 'eye' (U+0906 U+0901 U+0916). Present-day Hindi users tend to replace the Candrabindu by the Anusvara.
3.3.6 Nukta (़ - U+093C)⁹

The Nukta sign is placed below a certain number of consonants to represent sounds found only in words borrowed from Perso-Arabic. It is predominantly used in this manner in Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi and Tamang. It can be adjoined to क (U+0915), ख (U+0916), ग (U+0917), ज (U+091C) and फ (U+092B) to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style, e.g.

फ़िरोज़ /firoz/ (U+092B U+093C U+093F U+0930 U+094B U+091C U+093C)

It is also placed under ड (U+0921) and ढ (U+0922) to indicate flapped sounds, e.g.

बढ़ /bədh/ (U+092C U+0922 U+093C)

8 A half-nasal is used in epigraphy to indicate a nasal consonant conjoined to its corresponding "Varga" through a Halant.
9 The possible sets of consonants/vowels have been derived from various sources, viz. prior research carried out by the Centre for Development of Advanced Computing's [C-DAC] Graphics Intelligence based Script Technologies [GIST] Research Labs (https://cdac.in/index.aspx?id=mlc_gist_about), Omniglot, and inputs provided by various experts on board the NBGP for specific languages. Only Omniglot references have been provided as they are available online.
The Web Publication DEVANĀGARĪ ALPHABET AND ITS ROMANIZATION [109] by the Central Hindi Directorate, Ministry of HRD, Government of India clearly states such a use of Nukta in Hindi.

In Bodo the Nukta is adjoined to ड (U+0921) [110]. In Maithili it is adjoined to क (U+0915), ज (U+091C), ड (U+0921) and ढ (U+0922) [111]. In Sindhi it is adjoined to ख (U+0916), ग (U+0917), ज (U+091C), फ (U+092B), ड (U+0921) and ढ (U+0922) [104].

In Kashmiri it can also be adjoined to च (U+091A), छ (U+091B) and ज (U+091C) [108] to indicate the laterally released affricates:

च़ाय /cay/ 'tea' (U+091A U+093C U+093E U+092F)
छ़ल /chal/ 'wash' (imperative) (U+091B U+093C U+0932)
पॊज़ /póz/ 'fact' (U+092A U+094A U+091C U+093C)
Normally a Nukta is appended to a Consonant. However, the Santali language uses the Nukta in a unique way: it is adjoined to the following vowels and vowel signs:

a. आ (U+0906)
b. ओ (U+0913)
c. ा (U+093E)
d. ो (U+094B)
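
The per-language Nukta bases described in this section can be collected into a small lookup, sketched below. This is an assumption-laden illustration, not a rule from the LGR: `NUKTA_BASES` records only the bases named explicitly in the text (the Kashmiri entry lists just the three letters given for it), and `nukta_bases_ok` is a hypothetical helper name.

```python
NUKTA = "\u093C"

# Bases that the text above names as taking a Nukta, per language.
NUKTA_BASES = {
    "Hindi": set("\u0915\u0916\u0917\u091C\u092B\u0921\u0922"),
    "Bodo": {"\u0921"},
    "Maithili": set("\u0915\u091C\u0921\u0922"),
    "Sindhi": set("\u0916\u0917\u091C\u092B\u0921\u0922"),
    "Kashmiri": set("\u091A\u091B\u091C"),
    "Santali": set("\u0906\u0913\u093E\u094B"),  # vowels and vowel signs
}


def nukta_bases_ok(text, language):
    """True if every Nukta in text directly follows a base listed for the language."""
    allowed = NUKTA_BASES[language]
    return all(
        i > 0 and text[i - 1] in allowed
        for i, ch in enumerate(text)
        if ch == NUKTA
    )


tea = "\u091A\u093C\u093E\u092F"  # the Kashmiri example for 'tea' given above
print(nukta_bases_ok(tea, "Kashmiri"))
print(nukta_bases_ok(tea, "Hindi"))
```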
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)

The Visarga is frequently used in Sanskrit and represents a sound very close to /h/, for example दुःख /duhkh/ 'sorrow, unhappiness' (U+0926 U+0941 U+0903 U+0916).

The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages using Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire as it is barred in the Maximal Starting Repertoire.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)

The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Halant) where default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष (ksha), which is formed by क (ka) + ् (halant) + ष (sha), is rendered as क्‌ष when formed by क (ka) + ् (halant) + Zero Width Non-joiner + ष (sha). In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.

The Zero Width Joiner (ZWJ) is another invisible character, used in certain cases (mostly after Halant) in which a particular conjunct combination is rendered such that the constituent consonant shapes may not be directly visible in the conjunct shape. For example, the conjunct क्ष (ksha), formed by क (ka) + ् (halant) + ष (sha), does not show the half form of ka joining with sha. However, using ZWJ the constituent consonants' shapes are preserved in the visual depiction, i.e. क्‍ष, formed by क (ka) + ् (halant) + Zero Width Joiner + ष (sha).

Earlier the ZWJ was recommended by the Unicode Consortium for generating certain special conjuncts like Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of ZWJ is no longer encouraged.
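
Because ZWNJ and ZWJ are invisible, the three spellings of the ksha example are easiest to tell apart by listing their code points. The short Python sketch below does that; the variable names and the `codepoints` helper are illustrative only, and how each string is actually drawn depends on the font and rendering engine.

```python
KA, HALANT, SSA = "\u0915", "\u094D", "\u0937"
ZWNJ, ZWJ = "\u200C", "\u200D"


def codepoints(text):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


conjunct = KA + HALANT + SSA                 # default conjunct formation
explicit_halant = KA + HALANT + ZWNJ + SSA   # halant shown explicitly
half_form = KA + HALANT + ZWJ + SSA          # half form of ka requested

for name, text in [
    ("conjunct", conjunct),
    ("explicit halant (ZWNJ)", explicit_halant),
    ("half form (ZWJ)", half_form),
]:
    print(name, codepoints(text))
```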
4 Overall Development Process and Methodology

Under the Neo-Brahmi Generation Panel there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP ensured that the fundamental philosophy behind building those LGRs is in sync across all Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles

The NBGP adopts the following broad principles for selection of code points in the code point repertoire across the board for all the scripts within its ambit.

4.1.1 Inclusion principles

4.1.1.1 Modern usage

Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only or for archival purposes will not be considered for inclusion in the code point repertoire.

4.1.1.2 Unambiguous use

Every character proposed should have an unambiguous understanding among the linguistic community about its usage in the language.

4.1.2 Exclusion principles

The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope

The code point repertoire for the root zone being a very special case up the ladder in the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:

i. The Unicode Standard

Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.

ii. IDNA Protocol

Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.

For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ़ (U+093C), can be used instead.
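
The relationship between U+0958 and its decomposed form can be checked with Python's standard `unicodedata` module. The sketch below only illustrates Unicode normalization behaviour (U+0958 is a composition exclusion, so NFC never produces it); it does not consult any IDNA table, and the variable names are illustrative.

```python
import unicodedata

qa_precomposed = "\u0958"        # DEVANAGARI LETTER QA
ka_plus_nukta = "\u0915\u093C"   # DEVANAGARI LETTER KA + DEVANAGARI SIGN NUKTA

# NFC decomposes U+0958 and does not recompose it.
normalized = unicodedata.normalize("NFC", qa_precomposed)
print([f"U+{ord(ch):04X}" for ch in normalized])

# The decomposed sequence is already stable under NFC.
print(unicodedata.normalize("NFC", ka_plus_nukta) == ka_plus_nukta)
```

Since normalized text always ends up with the two-code-point sequence, that sequence is the form a label would carry in practice.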
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These characters are required in certain cases where an atypical visual shape of an akshar is desired.

In domain names, due to the absence of space " " or tab "-", there will be cases where the inability to use ZWNJ can pose some issues: two words need to be joined together, where the previous word ends in an Explicit Halant and the next word begins with a consonant. In that case a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if the two words देश् (deš, 'nation') and विदेश (videš, 'foreign land') are juxtaposed, the resultant word, i.e. "देश्विदेश"¹⁰, is not the appropriate way of rendering it. The appropriate rendering would be "देश्‌विदेश", which can be achieved by adding a ZWNJ in between the two words.

As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire if necessary.

However, there may not be much of an impact from the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes¹¹ may not be achievable; however, there will not be any impact on the phonetic level.
iii. Maximal Starting Repertoire

The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].

To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
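
The three successive filters can be pictured as a chain of checks applied in order. The Python sketch below is a toy model under stated assumptions: `IDNA_DISALLOWED` and `MSR_EXCLUDED` are tiny hypothetical stand-ins seeded only with the examples named in this section, not the real IDNA2008 or MSR data, and `root_zone_status` is an illustrative name.

```python
import unicodedata

# Hypothetical stand-ins for the real data sets (examples from the text only).
IDNA_DISALLOWED = {0x0958}               # DEVANAGARI LETTER QA
MSR_EXCLUDED = {0x093D, 0x200C, 0x200D}  # Avagraha, ZWNJ, ZWJ


def root_zone_status(cp):
    """Report the first filter that rejects a code point, if any."""
    # Filter 1: the code point must be encoded in Unicode at all.
    if unicodedata.name(chr(cp), None) is None:
        return "not encoded in Unicode"
    # Filter 2: the IDNA protocol removes some encoded characters.
    if cp in IDNA_DISALLOWED:
        return "excluded by IDNA"
    # Filter 3: the MSR removes some characters that IDNA allows.
    if cp in MSR_EXCLUDED:
        return "excluded by MSR"
    return "candidate for the repertoire"


for cp in (0x0915, 0x0958, 0x093D):
    print(f"U+{cp:04X}", root_zone_status(cp))
```

The order of the checks matters only for the reported reason; a code point has to pass all three to remain a candidate.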
10 In this particular case it is possible to get the required display by dropping the explicit Halant at the end of the word; however, in that case one can argue that the pronunciation of the two words, i.e. देश् and देश, is different and hence it changes the fundamental word.
11 Case of क्ष and क्‍ष: the first is composed with क + ् + ष while the latter is with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks

The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.

4.1.2.3 No Symbols and Abbreviations

Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970) etc. will not be included.

4.1.2.4 No Rare and Obsolete Characters

There are characters which have been added to Unicode to accommodate rare forms, such as DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ॄ (U+0944) and ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.

4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic

Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through Public Comment process

The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were of an editorial nature. Some comments called for attention to the normative section of the document. The NBGP at large, and the Devanagari team specifically, went through the comments in detail and decided for each individual comment whether it needed a change in the overall LGR recommendation.

Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire

Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.

5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4

Figure 2: Devanagari Code Page from [MSR]

Color convention¹²:
- All characters that are included in the [MSR]: Yellow background
- PVALID in IDNA2008 but excluded from the MSR for various reasons: Pinkish background
- Not PVALID in IDNA2008: White background

12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire

For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references for Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
| Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference |
|---|---|---|---|---|---|---|---|
| 1 | 0901 | ँ | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1 Hindi, Nepali | [0][101][102][103][105][108][110][111][112][113] |
| 2 | 0902 | ं | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113] |
| 3 | 0903 | ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113] |
| 4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113] |
| 5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113] |
| 6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113] |
| 7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113] |
| 8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113] |
| 9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113] |
| 10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103] |
| 11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 Hindi | [0][101] |
| 12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 Kashmiri | [0][105][108] |
| 13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][101][102][108][112] |
| 16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 Kashmiri | [0][105][108] |
| 17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113] |
| 23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113] |
| 24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113] |
| 28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113] |
| 29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113] |
| 33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113] |
| 34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 Nepali | [0][102][103][110][112][113] |
| 48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113] |
| 51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113] |
| 53 | 093A | ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 Kashmiri | [11][105][108] |
| 54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 Kashmiri | [11][105][108] |
| 55 | 093C | ़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 Hindi | [0][101][105][108][110][109][111] |
| 56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113] |
| 57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113] |
| 58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113] |
| 59 | 0941 | ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113] |
| 60 | 0942 | ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113] |
| 61 | 0943 | ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103] |
| 62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E (= candra) | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0][100][101][108] |
| 63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0][105][108] |
| 64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113] |
| 65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113] |
| 66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][108] |
| 67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0][105][108] |
| 68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113] |
| 69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113] |
| 70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant / Virama | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113] |
| 71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0][105][108] |
| 72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11][105][108] |
| 73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11][105][108] |
| 74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9][100][102][108][112] |
| 75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108] |
| 76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108] |
| 77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11][105][108] |
| 78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108] |
| 79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108] |
| 80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8][104] |
| 81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8][104] |
| 82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8][104] |
| 83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8][104] |

Table 6: Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph"¹³ construct.

| Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference |
|---|---|---|---|---|---|
| 1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107] |
| 2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107] |

Table 7: Sequences

13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by normal Ra, U+0930), the term "Reph" is used here.
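
Conditional inclusion means that U+0931 is acceptable only as the start of one of the two sequences in Table 7. A minimal Python sketch of that check follows; the regular expression and the function name `rra_usage_ok` are assumptions made for illustration and are not taken from the accompanying XML rule set.

```python
import re

# Matches an RRA (U+0931) that is NOT followed by VIRAMA + (YA | HA).
_STRAY_RRA = re.compile("\u0931(?!\u094D[\u092F\u0939])")


def rra_usage_ok(label):
    """True if every U+0931 in the label starts one of the two Table 7 sequences."""
    return _STRAY_RRA.search(label) is None


print(rra_usage_ok("\u0931\u094D\u092F"))  # RRA + VIRAMA + YA
print(rra_usage_ok("\u0931\u094D\u0915"))  # RRA + VIRAMA + KA is not a listed sequence
```

A negative lookahead keeps the check to a single pass over the label and leaves labels without U+0931 untouched.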
5.3 Code points not included

The following code points have not been included in the repertoire:

| Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion |
|---|---|---|---|---|
| 1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language. |
| 2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle. |
| 3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n. |
| 4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l. |
| 5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle. |
| 6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan. |
| 7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language. |
5.4 Structural Formation of Devanagari

All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. The next section gives detailed akshar formation rules as applicable to representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari in terms of:

- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)

It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.

In Section 7 the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi

This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second section lists four operators along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as Indian Script Code for Information Interchange [ISCII].

5.5.1 Variables involved

Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
28
5.5.2 Operators used
Symbol   Function
|        Alternative
[ ]      Optional
*        Variable repetition
( )      Sequence/Group
Table 8: Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari, when used to write Hindi, are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore V[B|D|X].
Examples
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ (a) | U+0905
Vowel + Anusvara | V[B] | अं (aṁ) | U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ (aṃ) | U+0905 U+0901
Vowel + Visarga | V[X] | अः (aḥ) | U+0905 U+0903
Table 9
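The rule V[B|D|X] can be checked mechanically. The Python sketch below is illustrative only: the vowel range U+0905 to U+0914 is a simplifying assumption and not the code point repertoire of this LGR, and the names `VOWEL_SEQUENCE` and `is_vowel_sequence` are hypothetical.

```python
import re

# Assumption: independent vowels approximated by the range U+0905..U+0914.
V = "[\u0905-\u0914]"
# B (Anusvara U+0902), D (Candrabindu U+0901), X (Visarga U+0903).
BDX = "[\u0902\u0901\u0903]"

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(V + BDX + "?")


def is_vowel_sequence(text):
    """Return True if text is a single vowel sequence V[B|D|X]."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

A second modifier after the first one (for example Anusvara followed by Visarga) fails the match, which mirrors the restriction to a single B, D or X.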
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. The consonant sequence can be extended further after N, M and H. Each of these cases is discussed in the following sections.
1. A single consonant (C)
(Wherever pertinent, the consonant shall be treated as coterminous with the consonant along with the Nukta sign.)
Examples
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क (ka) | U+0915 (single character)
Consonant + Nukta | C[N] | क़ (ḳa) | U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], Visarga [X] or Halant [H]
C[M|B|D|X|H]
Examples
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि (ki) | U+0915 U+093F
Consonant + Anusvara | C[B] | कं (kaṁ) | U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ (kaṃ) | U+0915 U+0901
Consonant + Visarga | C[X] | कः (kaḥ) | U+0915 U+0903
Consonant + Halant | C[H] | क् (k, pure consonant) | U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X
(CM)[D|B|X]
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं (kīṁ) | U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ (kāṃ) | U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः (kīḥ) | U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4; in the case of Sanskrit, up to 5) joined by Halant
*3(CH)C
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य (nkrya) | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets
3A. The combination may be followed by M, B, D or X
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की (kkī) | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं (kkaṁ) | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ (kkaṃ) | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः (kkaḥ) | U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं (kkīṁ) | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ (kkīṃ) | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः (kkīḥ) | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules when one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
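The consonant sequence cases 1 to 3B can be folded into a single pattern for experimentation. The following Python sketch rests on stated assumptions: the consolidated pattern (up to three CH pairs, then C, then either H or an optional M followed by an optional B, D or X) is one reading of the rules above, the character ranges are rough approximations rather than the repertoire of this LGR, and the names `CONSONANT_SEQUENCE` and `is_consonant_sequence` are hypothetical.

```python
import re

# Assumption: consonants approximated by U+0915..U+0939; a consonant is
# treated as coterminous with consonant + Nukta (U+093C).
C = "[\u0915-\u0939]\u093C?"
# Assumption: matras approximated by U+093E..U+094C.
M = "[\u093E-\u094C]"
# D (U+0901), B (U+0902), X (U+0903).
BDX = "[\u0901-\u0903]"
H = "\u094D"  # Halant

# Up to three (C H) pairs, a final C, then either a Halant
# or an optional Matra followed by at most one of B, D, X.
CONSONANT_SEQUENCE = re.compile(
    "(?:" + C + H + "){0,3}" + C + "(?:" + H + "|" + M + "?" + BDX + "?)"
)


def is_consonant_sequence(text):
    """Return True if text is one consonant sequence under the sketch above."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

The limit of three CH pairs reflects the Hindi akshar rule only; as noted above, the WLE rules in Section 7 place no limit on the number of consonants joined by Halant.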
6 Variants
There are no characters or character sequences in Devanagari which can be created using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and hence are being proposed as variants. A brief description of these variants follows, with the variants themselves listed in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for positioning the Nukta character (U+093C), which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated that the Whole Label Evaluation rules (given in Section 7) be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed.
Variant 1 | Variant 2
आ (U+0906) | आ़ (U+0906 U+093C)
ओ (U+0913) | ओ़ (U+0913 U+093C)
ा (U+093E) | ा़ (U+093E U+093C)
ो (U+094B) | ो़ (U+094B U+093C)
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
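The effect of the "not followed by a Nukta" condition can be illustrated with a small Python sketch. It is not the LGR's XML definition; the function name `nukta_variants` and the unit-by-unit tokenization are assumptions made for illustration. Each Table 16 base, with or without its Nukta, is treated as one unit, and the mapping is applied in either direction only when that unit is not followed by a further Nukta.

```python
from itertools import product

NUKTA = "\u093C"
# Variant 1 code points from Table 16.
NUKTA_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variants(label):
    """Return the variant labels produced by the Table 16 mappings."""
    options = []
    i = 0
    while i < len(label):
        ch = label[i]
        if ch in NUKTA_BASES:
            # The unit is the base alone or the base plus one Nukta.
            end = i + 2 if label[i + 1:i + 2] == NUKTA else i + 1
            if label[end:end + 1] == NUKTA:
                # Context fails: the unit is followed by a Nukta, no variant.
                options.append((label[i:end],))
            else:
                options.append((ch, ch + NUKTA))
            i = end
        else:
            options.append((ch,))
            i += 1
    return {"".join(choice) for choice in product(*options)} - {label}
```

With this condition, U+0906 U+093C U+093C yields no variants, so it cannot map to U+0906 U+093C, and no substitution can create a double Nukta.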
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When the variant from the overlapping sets is substituted for 0906 or 093E, respectively, in the left-hand side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection of these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 technically overlaps with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapping with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
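The symmetry and transitivity requirement on the extended sets can be expressed as a simple check. The Python sketch below encodes the four extended sets as space-separated code point strings; the data layout and the helper names `build_variant_map`, `is_symmetric` and `is_transitive` are assumptions for illustration and are not part of the LGR XML format. The context rules are deliberately left out.

```python
# Extended variant sets A to D, written as code point sequences.
VARIANT_SETS = {
    "A": ["093E 0901", "093E 093C 0901", "0949 0902"],
    "B": ["0906 0901", "0906 093C 0901", "0911 0902"],
    "C": ["0906 0902", "0906 093C 0902", "0974"],
    "D": ["093B", "093E 0902", "093E 093C 0902"],
}


def build_variant_map(sets):
    """Map every member of a set to all other members of the same set."""
    mapping = {}
    for members in sets.values():
        for member in members:
            mapping[member] = {m for m in members if m != member}
    return mapping


def is_symmetric(mapping):
    """If a maps to b, then b must map back to a."""
    return all(a in mapping[b] for a in mapping for b in mapping[a])


def is_transitive(mapping):
    """If a maps to b and b maps to c, then a must map to c."""
    return all(
        c == a or c in mapping[a]
        for a in mapping
        for b in mapping[a]
        for c in mapping[b]
    )
```

Treating each set as an equivalence class makes both properties hold by construction; the check is mainly useful when mappings are entered pair by pair and a member such as 0906 093C 0902 might be forgotten.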
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ (U+0973) | अं (U+0905 U+0902)
ऺ (U+093A) | ं (U+0902)
ॴ (U+0974) | आं (U+0906 U+0902)
ऻ (U+093B) | ां (U+093E U+0902)
ऎ (U+090E) | ऐ (U+0910)
ॆ (U+0946) | े (U+0947)
ॵ (U+0975) | औ (U+0914)
ॏ (U+094F) | ौ (U+094C)
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases, the presence or absence of the final Halant is clearly visible and there is no apparent case for making them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual variant pairs are as follows:
Variant 1 | Variant 2
ँ (U+0901) | ॅं (U+0945 U+0902)
ाँ (U+093E U+0901) | ॉं (U+0949 U+0902)
अँ (U+0905 U+0901) | ॲं (U+0972 U+0902)
एँ (U+090F U+0901) | ऍं (U+090D U+0902)
आँ (U+0906 U+0901) | ऑं (U+0911 U+0902)
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence to the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, to create a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that this sequence begins with a Matra sign, i.e. U+0945. While the Candrabindu (U+0901) can be preceded by a Consonant, a Vowel, a Nukta or a Matra, the same is not true for U+0945, which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
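The context condition on the Candrabindu mapping can be sketched in a few lines of Python. The consonant range U+0915 to U+0939 is a simplifying assumption rather than the repertoire of this LGR, and the names `is_consonant` and `candrabindu_variant_positions` are hypothetical.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"


def is_consonant(ch):
    # Assumption: consonants approximated by the range U+0915..U+0939.
    return "\u0915" <= ch <= "\u0939"


def candrabindu_variant_positions(label):
    """Indexes of U+0901 that may map to the sequence U+0945 U+0902."""
    positions = []
    for i, ch in enumerate(label):
        if ch != CANDRABINDU:
            continue
        j = i - 1
        if j >= 0 and label[j] == NUKTA:
            j -= 1  # look past a Nukta to the character that carries it
        if j >= 0 and is_consonant(label[j]):
            positions.append(i)
    return positions
```

A Candrabindu after a Vowel or a Matra is left without this variant, because substituting U+0945 U+0902 there would place a Matra after a character that cannot take one.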
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the "blocked" disposition.
There is no preference among these variants. Whichever label containing one of these variants is chosen first, the other equivalent variant labels should be blocked.
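The first-come blocking behaviour can be pictured with a toy registry in Python. Everything here is a hypothetical illustration: the `Registry` class and `variant_labels` function are not part of any LGR tool, and only two context-free pairs from Table 17 are used as sample data.

```python
from itertools import product

# Sample data: two context-free variant pairs from Table 17.
PAIRS = {
    "\u090E": "\u0910", "\u0910": "\u090E",
    "\u0975": "\u0914", "\u0914": "\u0975",
}


def variant_labels(label):
    """All labels that differ from label only by swapping paired code points."""
    options = [(ch, PAIRS[ch]) if ch in PAIRS else (ch,) for ch in label]
    return {"".join(choice) for choice in product(*options)} - {label}


class Registry:
    """Toy registry: the first applicant wins, its variants become blocked."""

    def __init__(self):
        self.delegated = set()
        self.blocked = set()

    def apply(self, label):
        if label in self.delegated or label in self.blocked:
            return False
        self.delegated.add(label)
        self.blocked |= variant_labels(label)
        return True
```

Because the mapping is symmetric, it does not matter which member of a variant pair is applied for first; the other one is blocked either way.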
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the case where a label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + …" cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are deemed to be equivalents of each other, semantically or otherwise. They are grouped only on the basis of possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं (U+0902) | ਂ (U+0A02)
इ (U+0907) | ਙ (U+0A19)
उ (U+0909) | ਤ (U+0A24)
ग (U+0917) | ਗ (U+0A17)
घ (U+0918) | ਬ (U+0A2C)
ट (U+091F) | ਟ (U+0A1F)
ठ (U+0920) | ਠ (U+0A20)
ढ (U+0922) | ਫ (U+0A2B)
प (U+092A) | ਧ (U+0A27)
भ (U+092D) | ਮ (U+0A2E)
म (U+092E) | ਸ (U+0A38)
व (U+0935) | ਕ (U+0A15)
ह (U+0939) | ਵ (U+0A35)
ऺ (U+093A) | ਂ (U+0A02)
़ (U+093C) | ਼ (U+0A3C)
ि (U+093F) | ਿ (U+0A3F)
ी (U+0940) | ੀ (U+0A40)
ॅ (U+0945) | ੱ (U+0A71)
ॆ (U+0946) | ੇ (U+0A47)
ॆ (U+0946) | ੋ (U+0A4B)
े (U+0947) | ੇ (U+0A47)
े (U+0947) | ੋ (U+0A4B)
ै (U+0948) | ੈ (U+0A48)
ॖ (U+0956) | ੁ (U+0A41)
ॗ (U+0957) | ੂ (U+0A42)
प्टि (U+092A U+094D U+091F U+093F) | ਇ (U+0A07)
प्टी (U+092A U+094D U+091F U+0940) | ਈ (U+0A08)
प्टे (U+092A U+094D U+091F U+0947) | ਏ (U+0A0F)
प्टॆ (U+092A U+094D U+091F U+0946) | ਏ (U+0A0F)
त्त (U+0924 U+094D U+0924) | ਜ (U+0A1C)
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म (U+092E) | ম (U+09AE)
ि (U+093F) | ি (U+09BF)
Table 20: Proposed Cross-script Devanagari-Bengali Variants
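Since a cross-script variant is a whole-label notion, a label only has a Gurmukhi counterpart when every one of its elements has a mapping. The Python sketch below illustrates this with a handful of single-code-point rows taken from Table 19; the excerpt is deliberately incomplete, the multi-code-point rows are left out, and the names `DEVA_TO_GURU` and `gurmukhi_variant` are hypothetical.

```python
# Excerpt of Table 19 (single code point rows only), Devanagari -> Gurmukhi.
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # GA
    "\u091F": "\u0A1F",  # TTA
    "\u0920": "\u0A20",  # TTHA
    "\u093C": "\u0A3C",  # Nukta
    "\u093F": "\u0A3F",  # vowel sign I
    "\u0940": "\u0A40",  # vowel sign II
}


def gurmukhi_variant(label):
    """Return the Gurmukhi look-alike label, or None if any element is unmapped."""
    if label and all(ch in DEVA_TO_GURU for ch in label):
        return "".join(DEVA_TO_GURU[ch] for ch in label)
    return None
```

A label containing even one code point without a listed counterpart has no whole-label cross-script variant under this sketch.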
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA) and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN, where CN is a C followed by an N
3. M must be preceded by C or CN, where CN is a C followed by an N
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
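The seven rules lend themselves to a table-driven check. The Python sketch below is a hedged illustration, not the normative XML: the category ranges are rough approximations of Table 6 (for example, the Kashmiri vowels and matras outside U+0905 to U+0914 and U+093E to U+094C are not classified), the Eyelash Reph sequences are ignored, and the names `category` and `wle_violations` are hypothetical.

```python
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
MARKS = {0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H", 0x093C: "N"}


def category(ch):
    """Approximate category of a code point (assumed ranges, see lead-in)."""
    cp = ord(ch)
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x0905 <= cp <= 0x0914:
        return "V"
    if 0x093E <= cp <= 0x094C:
        return "M"
    return MARKS.get(cp)


def wle_violations(label):
    """Return (index, rule number) pairs for every rule a label breaks."""
    violations = []
    cats = [category(ch) for ch in label]
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        after_c_or_cn = prev == "C" or (
            prev == "N" and i >= 2 and cats[i - 2] == "C"
        )
        if cat == "N" and (i == 0 or label[i - 1] not in C1 | V1 | M1):
            violations.append((i, 1))
        elif cat == "H" and not after_c_or_cn:
            violations.append((i, 2))
        elif cat == "M" and not after_c_or_cn:
            violations.append((i, 3))
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            violations.append((i, {"X": 4, "B": 5, "D": 6}[cat]))
        elif cat == "V" and prev == "H":
            violations.append((i, 7))
    return violations
```

For the label U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930, the function reports rule 7 at index 3, where a Vowel follows a Halant.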
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of permissible sequences in Table 7 (Sequences), it gets permitted only with the specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of a context rule is required.
Case of V preceded by H
As any valid akshar in Devanagari begins with either a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data
Following is the full list of NBGP members with their language expertise.
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi
In addition, the following members gave external input to the NBGP for the respective languages/scripts.
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", RFC 8228, August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics and Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1. Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition, in 2 Volumes. Hutchison, London, 1968.
10.2 DEVANĀGARĪ
1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar, 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10 (2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1. IS 10401: 8-bit code for information interchange, 1982.
2. IS 10315: 7-bit coded character set for information interchange, 1985.
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987.
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001.
5. ISO 2375: Procedure for registration of escape sequences, 2003.
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001.
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences

Table 21 below shows characters and character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C

Table 21: Visually confusables
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are grouped only on the basis of possible visual confusability.

At first they may not look exactly the same. However, in a given context, e.g. in a browser bar as part of a domain name, or as a single word with no surrounding text from the same script to help distinguish them, they can create visual confusion.

A label can be considered to have a cross-script variant label only if all of its constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it provides a visual distinction, and the string is therefore non-confusable.

Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 ¹⁷ | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
◌ॅ U+0945 | ◌ঁ U+0981 | Bengali

Table 22: Devanagari Cross-script confusables

¹⁷ The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
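The "all constituents must be confusable" condition described above can be illustrated with a minimal sketch. The mapping below is only an excerpt of Table 22, the names `CROSS_SCRIPT` and `is_fully_confusable` are hypothetical, and the check works per code point rather than per akshara, which is a simplification.

```python
# Hypothetical excerpt of Table 22:
# Devanagari code point -> {other script: set of confusable code points}
CROSS_SCRIPT = {
    "\u0920": {"Gurmukhi": {"\u0A28", "\u0A30"}},  # TTHA
    "\u0921": {"Gurmukhi": {"\u0A21", "\u0A24"}},  # DDA
    "\u0922": {"Gurmukhi": {"\u0A22"}},            # DDHA
    "\u0924": {"Gurmukhi": {"\u0A1C"}},            # TA
    "\u092F": {"Gurmukhi": {"\u0A27"}},            # YA
    "\u0918": {"Bengali": {"\u0998"}},             # GHA
}


def is_fully_confusable(label: str, script: str) -> bool:
    """Return True only if every code point of `label` has a listed
    confusable in `script`; a single non-confusable code point is
    enough to make the whole label visually distinct."""
    return bool(label) and all(
        script in CROSS_SCRIPT.get(ch, {}) for ch in label
    )
```

Under these assumptions, a label made of ठ and ड is flagged for Gurmukhi, while adding क (which has no entry) makes the label non-confusable.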
3.3.2 The Implicit Vowel Killer: Halant⁶

All consonants contain an implicit vowel (schwa). A special sign is needed to denote that this implicit vowel is stripped off. This is known as the Halant (U+094D). The Halant thus joins two consonants and creates conjuncts, which generally combine 2 to 4 consonants. In rare cases it can join up to 5 consonants. However, the notion of a maximum number of consonants joining to form one akshar is empirical. It is just an observation drawn from the words that have been observed to date. Given the confluence of languages happening in the Internet age, the possibility that one may want a generic Top Level Domain [gTLD] which has more than the observed maximum cannot be ruled out. Hence, in the LGR work, this limit will not be enforced.⁷

⁶ Unicode (cf. Unicode 3.0 and above) prefers the term Virama. In this report both terms have been used to denote the character that suppresses the inherent vowel.
⁷ This can be the case when a foreign-language word which admits a large number of consonants is transliterated into Devanāgarī.

3.3.3 Vowels

Separate symbols exist for all vowels which are pronounced independently, either at the beginning or after a vowel sound. To indicate a vowel sound other than the implicit one, a vowel sign (Matra) is attached to the consonant. Since the consonant has a built-in schwa, there are equivalent Matras for all vowels except अ.

The correlation is shown as follows:

Vowel | Corresponding vowel sign (Matra)
अ U+0905 | (none)
आ U+0906 | ा U+093E
इ U+0907 | ि U+093F
ई U+0908 | ी U+0940
उ U+0909 | ◌ु U+0941
ऊ U+090A | ◌ू U+0942
ऋ U+090B | ◌ृ U+0943
ए U+090F | ◌े U+0947
ऐ U+0910 | ◌ै U+0948
ओ U+0913 | ो U+094B
औ U+0914 | ौ U+094C
ॳ U+0973 | ◌ऺ U+093A
ॴ U+0974 | ऻ U+093B
ऎ, ऄ U+090E, U+0904 | ◌ॆ U+0946
ऒ U+0912 | ॊ U+094A
ऍ, ॲ U+090D, U+0972 | ◌ॅ U+0945
ॠ U+0960 | ◌ॄ U+0944
ऑ U+0911 | ॉ U+0949
ॵ U+0975 | ॏ U+094F
ॶ U+0976 | ◌ॖ U+0956
ॷ U+0977 | ◌ॗ U+0957

Table 5: Vowels with corresponding Matras

Marathi uses ॲ (U+0972) instead of ऍ (U+090D).
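As a small illustration of the vowel/Matra correlation, the sketch below maps an independent vowel to its dependent sign and attaches it to a consonant. The dictionary holds only a few rows of Table 5, and the names `VOWEL_TO_MATRA` and `attach_vowel` are hypothetical.

```python
# A few rows of Table 5: independent vowel -> dependent vowel sign (Matra)
VOWEL_TO_MATRA = {
    "\u0906": "\u093E",  # AA
    "\u0907": "\u093F",  # I
    "\u0908": "\u0940",  # II
    "\u0909": "\u0941",  # U
    "\u090F": "\u0947",  # E
    "\u0913": "\u094B",  # O
}

LETTER_A = "\u0905"  # the inherent schwa; it has no Matra


def attach_vowel(consonant: str, vowel: str) -> str:
    """Return the consonant carrying the given vowel sound.

    For LETTER A nothing is appended, because the consonant already
    carries the implicit schwa.
    """
    if vowel == LETTER_A:
        return consonant
    return consonant + VOWEL_TO_MATRA[vowel]
```

For instance, attaching इ to क yields the two code points U+0915 U+093F, while attaching अ leaves क unchanged.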
3.3.4 The Anusvara (◌ं - U+0902)

The Anusvara represents a homorganic nasal. It replaces a conjunct group of a Nasal Consonant + Halant + Consonant belonging to that particular varga. Before a non-varga consonant, the Anusvara represents a nasal sound. Modern Hindi, Marathi and Konkani prefer the Anusvara to the corresponding half-nasal.⁸

सन्त vs संत, sənt, "saint": U+0938 U+0928 U+094D U+0924 vs U+0938 U+0902 U+0924
चम्पा vs चंपा, tʃəmpa, "a flower belonging to the genus Plumeria": U+091A U+092E U+094D U+092A U+093E vs U+091A U+0902 U+092A U+093E
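The replacement of a nasal conjunct by the Anusvara can be sketched as follows. This is only an illustration of the rule as stated above, not part of the LGR: the varga table and the function name `to_anusvara` are assumptions, and the substitution fires only when the nasal and the following consonant belong to the same varga.

```python
import re

HALANT = "\u094D"
ANUSVARA = "\u0902"

# Nasal of each varga -> the non-nasal consonants of the same varga
VARGA = {
    "\u0919": "\u0915\u0916\u0917\u0918",  # NGA: ka kha ga gha
    "\u091E": "\u091A\u091B\u091C\u091D",  # NYA: ca cha ja jha
    "\u0923": "\u091F\u0920\u0921\u0922",  # NNA: tta ttha dda ddha
    "\u0928": "\u0924\u0925\u0926\u0927",  # NA:  ta tha da dha
    "\u092E": "\u092A\u092B\u092C\u092D",  # MA:  pa pha ba bha
}


def to_anusvara(word: str) -> str:
    """Rewrite Nasal + Halant before a same-varga consonant as Anusvara."""
    for nasal, stops in VARGA.items():
        # Lookahead keeps the following consonant in place.
        pattern = nasal + HALANT + "(?=[" + stops + "])"
        word = re.sub(pattern, ANUSVARA, word)
    return word
```

Applied to the two examples above, the sketch turns U+0938 U+0928 U+094D U+0924 into U+0938 U+0902 U+0924, and U+091A U+092E U+094D U+092A U+093E into U+091A U+0902 U+092A U+093E.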
3.3.5 Nasalization: Candrabindu (◌ँ - U+0901)

Candrabindu denotes nasalization of the preceding vowel, as in आँख, ãkh, "eye" (U+0906 U+0901 U+0916). Present-day Hindi users tend to replace the Candrabindu by the Anusvara.
3.3.6 Nukta (◌़ - U+093C)⁹

The Nukta sign is placed below a certain number of consonants to represent sounds found only in words borrowed from Perso-Arabic. It is predominantly used in this manner in Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi and Tamang. It can be adjoined to क (U+0915), ख (U+0916), ग (U+0917), ज (U+091C) and फ (U+092B) to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style, e.g.

फ़िरोज़, firoz (U+092B U+093C U+093F U+0930 U+094B U+091C U+093C)

It is also placed under ड (U+0921) and ढ (U+0922) to indicate flapped sounds, e.g.

बढ़, bədh (U+092C U+0922 U+093C)

The web publication DEVANĀGARĪ ALPHABET AND ITS ROMANIZATION [109] by the Central Hindi Directorate, Ministry of HRD, Government of India clearly states such a use of Nukta in Hindi.

In Bodo, the Nukta is adjoined to ड (U+0921) [110]. In Maithili, it is adjoined to क (U+0915), ज (U+091C), ड (U+0921) and ढ (U+0922) [111]. In Sindhi, it is adjoined to ख (U+0916), ग (U+0917), ज (U+091C), फ (U+092B), ड (U+0921) and ढ (U+0922) [104]. In Kashmiri, it can also be adjoined to च (U+091A), छ (U+091B) and ज (U+091C) [108] to indicate the laterally released affricates:

च़ाय, cay, "tea" (U+091A U+093C U+093E U+092F)
छ़ल, chal, "wash" (imperative) (U+091B U+093C U+0932)
पॊज़, póz, "fact" (U+092A U+094A U+091C U+093C)

Normally, a Nukta is appended to a consonant. However, the Santali language uses the Nukta in a unique way: the Nukta is adjoined to the following vowels and vowel signs:

a. आ (U+0906)
b. ओ (U+0913)
c. ा (U+093E)
d. ो (U+094B)

⁸ A half-nasal is used in epigraphy to indicate a nasal consonant conjoined to its corresponding "Varga" through a Halant.
⁹ The possible sets of consonants/vowels have been derived from various sources, viz. prior research carried out by the Centre for Development of Advanced Computing's [C-DAC] Graphics Intelligence based Script Technologies [GIST] Research Labs (https://cdac.in/index.aspx?id=mlc_gist_about), Omniglot, and inputs provided by various experts on board the NBGP for specific languages. Only Omniglot references have been provided, as they are available online.
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)

The Visarga is frequently used in Sanskrit and represents a sound very close to h, for example दुःख, duhkh, "sorrow, unhappiness" (U+0926 U+0941 U+0903 U+0916).

The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages using Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire, as it is barred in the Maximal Starting Repertoire.

3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)

The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Halant) where default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष (ksha), which is formed by क (ka) + ◌् (halant) + ष (sha), is rendered with a visible Halant under क, followed by a full ष, when it is formed by क (ka) + ◌् (halant) + Zero Width Non-joiner + ष (sha), i.e. U+0915 U+094D U+200C U+0937. In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.

The Zero Width Joiner (ZWJ) is another invisible character, which is used in certain cases (mostly after Halant) in which a particular conjunct combination gets rendered such that the constituent consonant shapes may not be directly visible in the conjunct shape. For example, the conjunct क्ष (ksha), which is formed by क (ka) + ◌् (halant) + ष (sha), does not show the half form of ka joining with sha. However, using ZWJ, the constituent consonants' shapes are preserved in the visual depiction, i.e. the half form of क followed by ष, formed by क (ka) + ◌् (halant) + Zero Width Joiner + ष (sha), i.e. U+0915 U+094D U+200D U+0937.

Earlier, the ZWJ was recommended by the Unicode Consortium for generating certain special conjuncts like Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of ZWJ is now not encouraged.
4 Overall Development Process and Methodology

Under the Neo-Brahmi Generation Panel, there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR. However, the Neo-Brahmi GP ensured that the fundamental philosophy behind building those LGRs is in sync across all Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles

The NBGP adopts the following broad principles for selection of code points in the code point repertoire, across the board for all the scripts within its ambit.

4.1.1 Inclusion principles

4.1.1.1 Modern usage

Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code point repertoire.

4.1.1.2 Unambiguous use

Every character proposed should have an unambiguous understanding among the linguistic community about its usage in the language.

4.1.2 Exclusion principles

The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope

The code point repertoire for the root zone being a very special case up the ladder in the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:

i. The Unicode Standard

Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.

ii. IDNA Protocol

Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.

For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ◌़ (U+093C), can be used instead.
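The relationship between the precomposed letter and its decomposed form can be checked with Python's standard `unicodedata` module. This is a minimal sketch rather than an IDNA implementation. The helper name `to_decomposed_nukta_form` is hypothetical, and it relies on U+0958 being excluded from composition, so that NFC normalization yields the two-code-point sequence.

```python
import unicodedata

QA_PRECOMPOSED = "\u0958"       # DEVANAGARI LETTER QA
KA_PLUS_NUKTA = "\u0915\u093C"  # DEVANAGARI LETTER KA + DEVANAGARI SIGN NUKTA


def to_decomposed_nukta_form(label: str) -> str:
    """Normalize a label to NFC.

    Precomposed nukta letters such as U+0958 do not survive NFC,
    so they come out as base consonant + U+093C.
    """
    return unicodedata.normalize("NFC", label)


if __name__ == "__main__":
    result = to_decomposed_nukta_form(QA_PRECOMPOSED)
    print([f"U+{ord(ch):04X}" for ch in result])
```

A label that already uses the decomposed sequence is left unchanged by the same call.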
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These characters are required in certain cases where an atypical visual shape of an akshar is desired.

In domain names, due to the absence of space or tab characters, there will be cases where the inability to use ZWNJ can pose some issues: two words need to be joined together, where the previous word needs to end in an explicit Halant and the next word begins with a consonant. In that case, a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if the two words देश् (deš, "nation") and विदेश (videš, "foreign land") are juxtaposed, the resultant word, i.e. "देश्विदेश"¹⁰, is not the appropriate way of rendering it. The appropriate rendering, in which the Halant of the first word stays visible and no conjunct is formed, can be achieved by adding a ZWNJ between the two words.

As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire, if necessary.

However, there may not be much of an impact from the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes¹¹ may not be able to be made. However, there will not be any impact on the phonetic level.
iii. Maximal Starting Repertoire

The Root Zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].

To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
¹⁰ In this particular case it is possible to get the required display by dropping the explicit Halant at the end of the word. However, in that case one can argue that the pronunciation of the two words, i.e. देश् and देश, is different, and hence it changes the fundamental word.
¹¹ This is the case of the two renderings of क्ष: the first is composed with क + ◌् + ष, while the latter is composed with क + ◌् + ZWJ + ष. The pronunciation of both conjuncts is the same.
4.1.2.2 No Punctuation Marks

The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.

4.1.2.3 No Symbols and Abbreviations

Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970) etc. will not be included.

4.1.2.4 No Rare and Obsolete Characters

There are characters which have been added to Unicode to accommodate rare forms, especially like DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ◌ॄ (U+0944) and ◌ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.

4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic

Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through the Public Comment process

The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were editorial in nature. Some comments called for attention to the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided, for each individual comment received, whether it needed a change in the overall LGR recommendation.

Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire

Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.

5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4

Figure 2: Devanagari Code Page from [MSR]

Color convention¹²:
All characters that are included in the [MSR]: Yellow background
PVALID in IDNA 2008 but excluded from the MSR for various reasons: Pinkish background
Not PVALID in IDNA 2008: White background

¹² This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire

For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references of Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, together they cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 | 0901 | ◌ँ | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1 (Hindi, Nepali) | [0][101][102][103][105][108][110][111][112][113]
2 | 0902 | ◌ं | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][113]
3 | 0903 | ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][113]
4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][113]
5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][113]
6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][113]
7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][113]
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 (Hindi) | [0][101][102][103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 (Hindi) | [0][101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 (Kashmiri) | [0][105][108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 (Hindi) | [0][100][101][102][108][112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 (Kashmiri) | [0][105][108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 (Nepali) | [0][102][103][110][112][113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][104][105][108][113]
53 | 093A | ◌ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 (Kashmiri) | [11][105][108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 (Kashmiri) | [11][105][108]
55 | 093C | ◌़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 (Hindi) | [0][101][105][108][110][109][111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][113]
59 | 0941 | ◌ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][113]
60 | 0942 | ◌ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][113]
61 | 0943 | ◌ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 (Hindi) | [0][101][102][103]
62 | 0945 | ◌ॅ | DEVANAGARI VOWEL SIGN CANDRA E (= candra) | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 (Hindi) | [0][100][101][108]
63 | 0946 | ◌ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 (Kashmiri) | [0][105][108]
64 | 0947 | ◌े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][105][108][113]
65 | 0948 | ◌ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 (Hindi) | [0][100][108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 (Kashmiri) | [0][105][108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][105][108][113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][105][108][113]
70 | 094D | ◌् | DEVANAGARI SIGN VIRAMA | Halant / Virama | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0][101][102][103][105][108][113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 (Kashmiri) | [0][105][108]
72 | 0956 | ◌ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 (Kashmiri) | [11][105][108]
73 | 0957 | ◌ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 (Kashmiri) | [11][105][108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 (Konkani, Marathi) | [9][100][102][108][112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 (Kashmiri) | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 (Kashmiri) | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 (Kashmiri) | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 (Kashmiri) | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 (Kashmiri) | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 (Sindhi) | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 (Sindhi) | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 (Sindhi) | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 (Sindhi) | [8][104]

Table 6: Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph"¹³ construct.

Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]

Table 7: Sequences

¹³ Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by normal Ra, U+0930), the term "Reph" is used here.
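Because DEVANAGARI LETTER RRA is admitted only through the two sequences of Table 7, a label check can be sketched as follows. This is a minimal illustration, not the normative rule from the accompanying XML file. The function name `rra_usage_ok` is hypothetical.

```python
RRA, VIRAMA, YA, HA = "\u0931", "\u094D", "\u092F", "\u0939"

# The only two contexts in which RRA is admitted (Table 7)
ALLOWED_RRA_SEQUENCES = (RRA + VIRAMA + YA, RRA + VIRAMA + HA)


def rra_usage_ok(label: str) -> bool:
    """Return True if every RRA in `label` starts one of the allowed
    sequences; labels without RRA pass trivially."""
    i = label.find(RRA)
    while i != -1:
        # str.startswith accepts a tuple of prefixes and a start offset.
        if not label.startswith(ALLOWED_RRA_SEQUENCES, i):
            return False
        i = label.find(RRA, i + 1)
    return True
```

A label containing U+0931 U+094D U+092F passes, whereas U+0931 followed directly by a Matra is rejected.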
5.3 Code points not included

The following code points have not been included in the repertoire:

Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ◌ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari

All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. The next section gives detailed akshar formation rules as applicable to the representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:

- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)

It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.

In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi

This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second section lists four operators, along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formation, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].

5.5.1 Variables involved

Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
5.5.2 Operators used

Symbol    Function
|         Alternative
[ ]       Optional
*         Variable Repetition
( )       Sequence Group

Table 8: Symbol functions

In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
553 TheVowelSequence
AvowelsequencebeginswithavowelItmaybeoptionallyfollowedbyanAnusvara(B)
Candrabindu (D) or a Visarga (X) The number of B D or X which can follow a V in
Devanagariarerestrictedtoone
Thepossibility of aVisarga following aCandrabinduorAnusvara is ruledout since it is
usedonlyinVedicandinBengaliscript
ThevowelsequenceinHindiisthereforeV[B|D|X]
Examples
SequenceDescription Sequence Example Constitutingcharacters
Vowel V अ a U+0905
Vowel+Anusvara V[B] अ aṁ U+0905U+0902
अ U+0905U+0902
Vowel+Candrabindu V[D] अ aṃ U+0905U+0901
अ U+0905U+0901
Vowel+Visarga V[X] अः aḥ U+0905U+0903
अ ः U+0905U+0903
Table 9
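The vowel sequence V[B|D|X] maps directly onto a regular expression. The following is a minimal sketch, not part of the proposal: the name `is_vowel_sequence` is hypothetical, and the vowel class is assumed to be the plain Unicode range U+0905..U+0914 rather than the LGR repertoire.

```python
import re

# Assumption: plain Unicode ranges stand in for the LGR code point repertoire.
V = "\u0905-\u0914"                        # independent vowels (V)
B, D, X = "\u0902", "\u0901", "\u0903"     # Anusvara, Candrabindu, Visarga

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(f"[{V}][{B}{D}{X}]?")

def is_vowel_sequence(text: str) -> bool:
    """Return True if text is a single vowel-initial akshar."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Under these assumptions, U+0905 U+0902 is accepted, while U+0905 U+0902 U+0903 is rejected because at most one of B, D or X may follow the vowel.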
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may optionally be followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of each of these characters occurring after a consonant is restricted to one. The consonant sequence can be extended further after the N, M and H. Each of these cases is discussed below.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples
Sequence Description   Sequence  Example   Constituting characters
Consonant              C         क (ka)    U+0915 <single character>
Consonant + Nukta      C[N]      क़ (ḳa)    U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], Visarga [X] or Halant [H]
C[M|B|D|X|H]
Examples
Sequence Description      Sequence  Example                 Constituting characters
Consonant + Matra         C[M]      कि (ki)                 U+0915 U+093F
Consonant + Anusvara      C[B]      कं (kaṁ)                U+0915 U+0902
Consonant + Candrabindu   C[D]      कँ (kaṃ)                U+0915 U+0901
Consonant + Visarga       C[X]      कः (kaḥ)                U+0915 U+0903
Consonant + Halant        C[H]      क् (k, pure consonant)  U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X
(CM)[D|B|X]
Example
Sequence Description              Sequence  Example     Constituting characters
Consonant + Matra + Anusvara      CM[B]     कीं (kīṁ)   U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu   CM[D]     काँ (kāṃ)   U+0915 U+093E U+0901
Consonant + Matra + Visarga       CM[X]     कीः (kīḥ)   U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14)
*3(CH)C
Example
Sequence Description: Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant
Sequence: CHCHCHC
Example: न्क्र्य (nkrya)
Constituting characters: U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets
3A. The combination may be followed by M, B, D or X
Example
Sequence Description                           Sequence  Example       Constituting characters
Consonant + Halant + Consonant + Matra         CHC[M]    क्की (kkī)    U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara      CHC[B]    क्कं (kkaṁ)   U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu   CHC[D]    क्कँ (kkaṃ)   U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga       CHC[X]    क्कः (kkaḥ)   U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X
Example
Sequence Description                                   Sequence  Example        Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara      CHCM[B]   क्कीं (kkīṁ)   U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu   CHCM[D]   क्कीँ (kkīṃ)   U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga       CHCM[X]   क्कीः (kkīḥ)   U+0915 U+094D U+0915 U+0940 U+0903
Table 15
14 In case of Sanskrit, it can join up to 5 consonants.
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules once digits, punctuation and special standalone characters like Avagraha are taken into account. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
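The consonant sequence rules for Hindi (1, 2, 2A, 3, 3A and 3B) can be folded into a single pattern. The sketch below is illustrative only: `is_consonant_sequence` is a hypothetical name, the character classes are assumed to be plain Unicode ranges rather than the LGR repertoire, and the limit of four joined consonants follows the Hindi rule above, not the less restrictive WLE rules of Section 7.

```python
import re

# Assumption: plain Unicode ranges stand in for the LGR code point repertoire.
C = "[\u0915-\u0939]"      # consonants
M = "[\u093E-\u094C]"      # Matras (dependent vowel signs)
BDX = "[\u0901-\u0903]"    # Candrabindu, Anusvara, Visarga
H = "\u094D"               # Halant / Virama
N = "\u093C"               # Nukta

CN = f"{C}{N}?"            # a consonant is coterminous with consonant + Nukta

CONSONANT_SEQUENCE = re.compile(
    f"{CN}{H}"                               # C[H]: a pure consonant
    f"|(?:{CN}{H}){{0,3}}{CN}{M}?{BDX}?"      # *3(CH)C, then optional M and one of B, D, X
)

def is_consonant_sequence(text: str) -> bool:
    """Return True if text is one consonant-initial akshar under the Hindi rules."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

With these assumptions, the Table 13 example (U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F) is accepted, while five consonants joined by Halant are rejected.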
6 Variants
There are no characters or character sequences in Devanagari that can be created using the characters permitted by the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve deviations from the widely accepted norms of Devanagari akshar formation. They can confuse even a careful observer and are hence proposed as variants. A brief description of these variants follows, with the variants themselves listed in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for positioning the Nukta character (U+093C), which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations required the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This makes it possible to create certain labels that can be deceptively similar for a majority of the Devanagari user base. As this is a unique case of homographic similarity, the following variants are being proposed.
Variant 1    Variant 2
आ U+0906    आ़ U+0906 U+093C
ओ U+0913    ओ़ U+0913 U+093C
ा U+093E    ा़ U+093E U+093C
ो U+094B    ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). In theory this implies a regenerative tendency: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character.
Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both the forward and the symmetric (reverse) variant mappings using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
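The effect of the "not followed by a Nukta" condition can be illustrated with a short sketch. This is not how an LGR tool is implemented; the function name `add_nukta_variants` is hypothetical, and it applies the Variant 1 to Variant 2 mapping of Table 16 at every eligible position at once.

```python
NUKTA = "\u093C"
# Variant 1 column of Table 16.
NUKTA_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}

def add_nukta_variants(label: str) -> str:
    """Apply Variant 1 -> Variant 2 wherever the context rule allows it."""
    out = []
    for i, ch in enumerate(label):
        out.append(ch)
        followed_by_nukta = label[i + 1 : i + 2] == NUKTA
        # Context rule: the mapping exists only if no Nukta already follows.
        if ch in NUKTA_BASES and not followed_by_nukta:
            out.append(NUKTA)
    return "".join(out)
```

Because of the context check, applying the function to its own output changes nothing, so the regenerative sequence U+0906 U+093C U+093C is never produced.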
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set            Mapping
Variant Set A  093E 0901 <--> 0949 0902
Variant Set B  0906 0901 <--> 0911 0902
Variant Set C  0906 0902 <--> 0974
Variant Set D  093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source  Glyph  Target     Glyph  Type       Variant Context
0906    आ     0906 093C  आ़     ↔ blocked  not followed-by-N
093E    ा      093E 093C  ा़      ↔ blocked  not followed-by-N
When the variant from the overlapping sets is substituted for 0906 or 093E, respectively, in the sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence or absence of a Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A
0906 093C 0901 is added to Set B
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection of these contexts.
Likewise for Set D: 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive the context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set            Mapping                                        Variant Contextual Rule
Variant Set A  093E 0901 <--> 093E 093C 0901 <--> 0949 0902   -
Variant Set B  0906 0901 <--> 0906 093C 0901 <--> 0911 0902   -
Variant Set C  0906 0902 <--> 0906 093C 0902 <--> 0974        when(followed-by-V-C-or-end)
Variant Set D  093B <--> 093E 0902 <--> 093E 093C 0902        when(followed-by-V-C-or-end)
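The extension of Sets A to D can be reproduced mechanically. The sketch below is only an illustration of the bookkeeping, with a hypothetical helper `extend_with_nukta`; it represents each sequence as a tuple of code points and inserts U+093C after every 0906 or 093E found in a set member.

```python
NUKTA = 0x093C
NUKTA_BASES = {0x0906, 0x093E}   # code points that take the Nukta variant here

def extend_with_nukta(variant_set):
    """Return the set members plus their Nukta-bearing counterparts."""
    extended = list(variant_set)
    for seq in variant_set:
        for i, cp in enumerate(seq):
            if cp in NUKTA_BASES:
                with_nukta = seq[: i + 1] + (NUKTA,) + seq[i + 1 :]
                if with_nukta not in extended:
                    extended.append(with_nukta)
    return extended

SET_A = [(0x093E, 0x0901), (0x0949, 0x0902)]
SET_D = [(0x093B,), (0x093E, 0x0902)]
```

For `SET_A` this adds 093E 093C 0901, and for `SET_D` it adds 093E 093C 0902, matching the table above. The context rules for Sets C and D are not modelled in this sketch.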
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 technically overlaps with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another variant set that overlaps with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as the second element have V or M as the first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1    Variant 2
ॳ U+0973    अं U+0905 U+0902
ऺ U+093A     ं U+0902
ॴ U+0974    आं U+0906 U+0902
ऻ U+093B    ां U+093E U+0902
ऎ U+090E    ऐ U+0910
ॆ U+0946     े U+0947
ॵ U+0975    औ U+0914
ॏ U+094F    ौ U+094C
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the Halant functions as a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence or absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there are also some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case for making them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape that resembles a Candrabindu (U+0901). This gives rise to pairs which are rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1            Variant 2
ँ U+0901             ॅं U+0945 U+0902
ाँ U+093E U+0901     ॉं U+0949 U+0902
अँ U+0905 U+0901    ॲं U+0972 U+0902
एँ U+090F U+0901    ऍं U+090D U+0902
आँ U+0906 U+0901    ऑं U+0911 U+0902
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not join properly in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क, U+0915):
Variant Pair 1: कँ (क + U+0901) and कॅं (क + U+0945 U+0902)
Variant Pair 2: काँ (क + U+093E U+0901) and कॉं (क + U+0949 U+0902)
Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence to the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) requires that, to create a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that this sequence begins with a Matra sign, i.e. (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, Vowel, Nukta or Matra, the same is not true for (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the character preceding (U+0945 U+0902), as the Candrabindu can be preceded by any character that can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as the mapping (U+0901) -> (U+0945 U+0902).
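As an illustration of this context rule, the sketch below rewrites a Candrabindu to the Candra E + Anusvara sequence only when it follows a Consonant or a Consonant plus Nukta. The function name `candrabindu_variant` is hypothetical, and the consonant class is assumed to be the plain range U+0915..U+0939 rather than the LGR repertoire.

```python
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093C"

def is_consonant(ch: str) -> bool:
    # Assumption: a plain Unicode range stands in for the consonant class C.
    return len(ch) == 1 and "\u0915" <= ch <= "\u0939"

def candrabindu_variant(label: str) -> str:
    """Map U+0901 -> U+0945 U+0902 where the variant context rule holds."""
    out = []
    for i, ch in enumerate(label):
        prev = label[i - 1] if i >= 1 else ""
        prev2 = label[i - 2] if i >= 2 else ""
        after_c = is_consonant(prev)
        after_cn = prev == NUKTA and is_consonant(prev2)
        if ch == CANDRABINDU and (after_c or after_cn):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu after a Vowel or a Matra is left untouched, since a Matra such as U+0945 could not validly appear in that position.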
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing either of these variants is chosen first, the other equivalent variant label should be blocked.
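The first-come, blocked disposition can be pictured with a toy allocation table. Everything in this sketch is hypothetical (the `Registry` class, the `index_key` helper and the returned status strings), and only the four single-code-point pairs of Table 17 are modelled; sequence variants and context rules are left out for brevity.

```python
# Map each Variant 2 code point of four Table 17 pairs to its Variant 1.
CANONICAL = {
    "\u0910": "\u090E",   # AI        -> SHORT E
    "\u0947": "\u0946",   # sign E    -> sign SHORT E
    "\u0914": "\u0975",   # AU        -> AW
    "\u094C": "\u094F",   # sign AU   -> sign AW
}

def index_key(label: str) -> str:
    """Collapse variant code points so that variant labels share one key."""
    return "".join(CANONICAL.get(ch, ch) for ch in label)

class Registry:
    def __init__(self) -> None:
        self._taken = {}   # index key -> label that was allocated first

    def apply(self, label: str) -> str:
        key = index_key(label)
        if key in self._taken:
            return "already allocated" if self._taken[key] == label else "blocked"
        self._taken[key] = label
        return "allocated"
```

Whichever of two variant labels is applied for first is allocated; the other then resolves to the same key and is reported as blocked.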
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations were taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are deemed equivalents of each other, semantically or otherwise. They are grouped only on the basis of possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari                           Gurmukhi
ं U+0902                             ਂ U+0A02
इ U+0907                            ਙ U+0A19
उ U+0909                            ਤ U+0A24
ग U+0917                            ਗ U+0A17
घ U+0918                            ਬ U+0A2C
ट U+091F                            ਟ U+0A1F
ठ U+0920                            ਠ U+0A20
ढ U+0922                            ਫ U+0A2B
प U+092A                            ਧ U+0A27
भ U+092D                            ਮ U+0A2E
म U+092E                            ਸ U+0A38
व U+0935                            ਕ U+0A15
ह U+0939                            ਵ U+0A35
ऺ U+093A                             ਂ U+0A02
़ U+093C                             ਼ U+0A3C
ि U+093F                            ਿ U+0A3F
ी U+0940                            ੀ U+0A40
ॅ U+0945                             ੱ U+0A71
ॆ U+0946                             ੇ U+0A47
ॆ U+0946                             ੋ U+0A4B
े U+0947                             ੇ U+0A47
े U+0947                             ੋ U+0A4B
ै U+0948                             ੈ U+0A48
ॖ U+0956                             ੁ U+0A41
ॗ U+0957                             ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F   ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940   ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947   ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946   ਏ U+0A0F
त्त U+0924 U+094D U+0924           ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari   Bengali
म U+092E    ম U+09AE
ि U+093F    ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLE rules that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ, DEVANAGARI LETTER RRA), H is 094D (DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य, DEVANAGARI LETTER YA) or 0939 (ह, DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (footnote 15)
3. M must be preceded by C or CN (footnote 16)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
15, 16 where CN is a C followed by an N
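The seven rules above are all "preceded by" checks, so they can be evaluated in a single left-to-right pass. The following is a rough sketch rather than the LGR XML rules themselves: `category` and `wle_violations` are hypothetical names, the V, C and M classes are assumed to be plain Unicode ranges instead of the Table 6 repertoire, and sequences such as the Eyelash Reph are not modelled.

```python
# Sets named in rule 1.
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
SIGNS = {"\u0902": "B", "\u0901": "D", "\u0903": "X", "\u094D": "H", "\u093C": "N"}

def category(ch: str) -> str:
    # Assumption: plain Unicode ranges stand in for the repertoire categories.
    if ch in SIGNS:
        return SIGNS[ch]
    if "\u0905" <= ch <= "\u0914":
        return "V"
    if "\u0915" <= ch <= "\u0939":
        return "C"
    if "\u093E" <= ch <= "\u094C":
        return "M"
    return "?"   # repertoire membership is out of scope for this sketch

def wle_violations(label: str) -> list[int]:
    """Return the numbers of the WLE rules violated, in label order."""
    cats = [category(ch) for ch in label]
    violations = []
    for i, cat in enumerate(cats):
        prev_ch = label[i - 1] if i > 0 else ""
        prev = cats[i - 1] if i > 0 else ""
        prev2 = cats[i - 2] if i > 1 else ""
        after_c_or_cn = prev == "C" or (prev == "N" and prev2 == "C")
        if cat == "N" and prev_ch not in C1 | V1 | M1:
            violations.append(1)
        elif cat == "H" and not after_c_or_cn:
            violations.append(2)
        elif cat == "M" and not after_c_or_cn:
            violations.append(3)
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            violations.append({"X": 4, "B": 5, "D": 6}[cat])
        elif cat == "V" and prev == "H":
            violations.append(7)
    return violations
```

Under these assumptions, a label in which a Vowel directly follows a Halant reports rule 7, while the same label without the Halant reports no violations.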
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins with either a Consonant or a Vowel, for multi-word domains it was necessary to check whether both of these can follow each character that can end a valid akshar. Only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार (aːməchaːr, "Mango pickle"): U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930.
This is a case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of the H for full representation of the intended sound. By and large, however, the form of the first word without an H is considered sufficient for full representation of the sound intended for the first word.
This is a unique situation, necessitated by the lack of a hyphen, space or Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and .np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ. of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi
In addition, the following members gave external input to the NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K P Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC 7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC 8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0/ (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate, Ministry of HRD, Govt. of India, Devanāgarī Alphabet and its Romanization, http://hindinideshalaya.nic.in/english/hindi_orgin/devnagari/thesysmbols.html (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Booksarticlesandwebographiesconsulted
Followingisathematicallysortedsetofdocumentsbooksarticlesandwebographies
consultedinthedraftingofthisreport
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
54
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1. IS 10401: 8-bit code for information interchange, 1982.
2. IS 10315: 7-bit coded character set for information interchange, 1985.
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987.
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001.
5. ISO 2375: Procedure for registration of escape sequences, 2003.
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001.
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters and character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क (U+0915) | क़ (U+0915 U+093C)
ख (U+0916) | ख़ (U+0916 U+093C)
ग (U+0917) | ग़ (U+0917 U+093C)
च (U+091A) | च़ (U+091A U+093C)
छ (U+091B) | छ़ (U+091B U+093C)
ज (U+091C) | ज़ (U+091C U+093C)
ड (U+0921) | ड़ (U+0921 U+093C)
ढ (U+0922) | ढ़ (U+0922 U+093C)
फ (U+092B) | फ़ (U+092B U+093C)
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same. However, in the given context (e.g. in a browser bar, as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing) they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
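The "all constituents must have a counterpart" condition can be sketched in a few lines of Python. This is only an illustration: the mapping holds the Gurmukhi pairs from Table 22, the function name `fully_confusable` is hypothetical, and the check works per code point rather than per akshara, which is a simplification of the rule as stated.

```python
# Devanagari -> Gurmukhi look-alikes, copied from the Gurmukhi rows of Table 22.
# These pairs are informative only; the document does not treat them as variants.
DEVA_TO_GURMUKHI = {
    "\u0920": {"\u0A28", "\u0A30"},  # TTHA ~ Gurmukhi NA, RA
    "\u0921": {"\u0A21", "\u0A24"},  # DDA  ~ Gurmukhi DDA, TA
    "\u0922": {"\u0A22"},            # DDHA ~ Gurmukhi DDHA
    "\u0924": {"\u0A1C"},            # TA   ~ Gurmukhi JA
    "\u092F": {"\u0A27"},            # YA   ~ Gurmukhi DHA
}


def fully_confusable(label, table=DEVA_TO_GURMUKHI):
    """Return True only if every code point of the label has a look-alike.

    A single code point without a counterpart is enough to make the whole
    label visually distinct (assumption: per-code-point granularity).
    """
    return len(label) > 0 and all(ch in table for ch in label)


print(fully_confusable("\u0920\u0921"))  # both letters have Gurmukhi look-alikes
print(fully_confusable("\u0920\u0915"))  # KA has none, so the label is distinct
```

A real evaluation would segment the label into aksharas first and would need one table per target script.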
Devanagari confusable | Other script confusable | From script
ः (U+0903) | ઃ (U+0A83) | Gujarati
ः (U+0903) | ః (U+0C03) | Telugu
ः (U+0903) | ಃ (U+0C83) | Kannada
ः (U+0903) | ഃ (U+0D03) | Malayalam
ः (U+0903) | ඃ (U+0D83) | Sinhala
ः (U+0903) | ঃ (U+0983)¹⁷ | Bengali
उ (U+0909) | ও (U+0993) | Bengali
घ (U+0918) | ঘ (U+0998) | Bengali
ठ (U+0920) | ਨ (U+0A28) | Gurmukhi
ठ (U+0920) | ਰ (U+0A30) | Gurmukhi
ड (U+0921) | ਡ (U+0A21) | Gurmukhi
ड (U+0921) | ਤ (U+0A24) | Gurmukhi
ढ (U+0922) | ਢ (U+0A22) | Gurmukhi
त (U+0924) | ਜ (U+0A1C) | Gurmukhi
य (U+092F) | ਧ (U+0A27) | Gurmukhi
ॅ (U+0945) | ঁ (U+0981) | Bengali
Table 22: Devanagari cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
ई (U+0908) | ी (U+0940)
उ (U+0909) | ु (U+0941)
ऊ (U+090A) | ू (U+0942)
ऋ (U+090B) | ृ (U+0943)
ए (U+090F) | े (U+0947)
ऐ (U+0910) | ै (U+0948)
ओ (U+0913) | ो (U+094B)
औ (U+0914) | ौ (U+094C)
ॳ (U+0973) | ऺ (U+093A)
ॴ (U+0974) | ऻ (U+093B)
ऎ, ऄ (U+090E, U+0904) | ॆ (U+0946)
ऒ (U+0912) | ॊ (U+094A)
ऍ, ॲ (U+090D, U+0972) | ॅ (U+0945)
ॠ (U+0960) | ॄ (U+0944)
ऑ (U+0911) | ॉ (U+0949)
ॵ (U+0975) | ॏ (U+094F)
ॶ (U+0976) | ॖ (U+0956)
ॷ (U+0977) | ॗ (U+0957)
Table 5: Vowels with corresponding Matras
Marathi uses ॲ (U+0972) instead of ऍ (U+090D).
3.3.4 The Anusvara (ं - U+0902)
The Anusvara represents a homorganic nasal. It replaces a conjunct group of a Nasal Consonant + Halant + Consonant belonging to that particular varga. Before a non-varga consonant, the Anusvara represents a nasal sound. Modern Hindi, Marathi and Konkani languages prefer the Anusvara to the corresponding Half-nasal.⁸
सन्त vs संत /sənt/ "saint": U+0938 U+0928 U+094D U+0924 vs U+0938 U+0902 U+0924
चम्पा vs चंपा /tʃəmpa/ "a flower belonging to the genus Plumeria family": U+091A U+092E U+094D U+092A U+093E vs U+091A U+0902 U+092A U+093E
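The replacement described above (Nasal Consonant + Halant + Consonant of the same varga becomes Anusvara + Consonant) can be sketched as a small string rewrite. The varga table and the function name `to_anusvara` are illustrative assumptions, not part of the LGR; the table deliberately lists only the non-nasal members of each varga, so a doubled nasal is left untouched.

```python
HALANT = "\u094D"
ANUSVARA = "\u0902"

# Nasal of each varga -> the non-nasal consonants of the same varga.
# Assumption for this sketch: only these homorganic pairs are rewritten.
VARGAS = {
    "\u0919": "\u0915\u0916\u0917\u0918",  # NGA: KA KHA GA GHA
    "\u091E": "\u091A\u091B\u091C\u091D",  # NYA: CA CHA JA JHA
    "\u0923": "\u091F\u0920\u0921\u0922",  # NNA: TTA TTHA DDA DDHA
    "\u0928": "\u0924\u0925\u0926\u0927",  # NA:  TA THA DA DHA
    "\u092E": "\u092A\u092B\u092C\u092D",  # MA:  PA PHA BA BHA
}


def to_anusvara(word):
    """Rewrite nasal + halant before a same-varga consonant as an Anusvara."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        if (ch in VARGAS and i + 2 < len(word)
                and word[i + 1] == HALANT and word[i + 2] in VARGAS[ch]):
            out.append(ANUSVARA)
            i += 2  # skip the nasal and the halant; the consonant is kept
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# The two example pairs from this section.
print(to_anusvara("\u0938\u0928\u094D\u0924") == "\u0938\u0902\u0924")
print(to_anusvara("\u091A\u092E\u094D\u092A\u093E") == "\u091A\u0902\u092A\u093E")
```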
3.3.5 Nasalization: Candrabindu (ँ - U+0901)
Candrabindu denotes nasalization of the preceding vowel, as in आँख /ãkh/ "eye" (U+0906 U+0901 U+0916). Present-day Hindi users tend to replace the Candrabindu by the Anusvara.
3.3.6 Nukta (़ - U+093C)⁹
The Nukta sign is placed below a certain number of consonants to represent sounds found only in words borrowed from Perso-Arabic. It is predominantly used in this manner in Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi and Tamang. It can be adjoined to क (U+0915), ख (U+0916), ग (U+0917), ज (U+091C) and फ (U+092B) to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style, e.g.
फ़िरोज़ /firoz/ (U+092B U+093C U+093F U+0930 U+094B U+091C U+093C)
It is also placed under ड (U+0921) and ढ (U+0922) to indicate flapped sounds, e.g.
बढ़ /bədh/ (U+092C U+0922 U+093C)
The web publication DEVANĀGARĪ ALPHABET AND ITS ROMANIZATION [109] by the Central Hindi Directorate, Ministry of HRD, Government of India, clearly states such a use of Nukta in Hindi.
In Bodo, the Nukta is adjoined to ड (U+0921) [110]. In Maithili, it is adjoined to क (U+0915), ज (U+091C), ड (U+0921) and ढ (U+0922) [111]. In Sindhi, it is adjoined to ख (U+0916), ग (U+0917), ज (U+091C), फ (U+092B), ड (U+0921) and ढ (U+0922) [104].
In Kashmiri, it can also be adjoined to च (U+091A), छ (U+091B) and ज (U+091C) [108] to indicate the laterally released affricates:
च़ाय /cay/ "tea" (U+091A U+093C U+093E U+092F)
छ़ल /chal/ "wash" (imperative) (U+091B U+093C U+0932)
पॊज़ /póz/ "fact" (U+092A U+094A U+091C U+093C)
Normally a Nukta is appended to a consonant. However, the Santali language uses the Nukta in a unique way: the Nukta is adjoined to the following vowels and vowel signs:
a. आ (U+0906)
b. ओ (U+0913)
c. ा (U+093E)
d. ो (U+094B)
8 A half-nasal is used in epigraphy to indicate a nasal consonant conjoined to its corresponding "Varga" through a Halant.
9 The possible sets of consonants/vowels have been derived from various sources, viz. prior research carried out by the Centre for Development of Advanced Computing's [C-DAC] Graphics & Intelligence based Script Technologies [GIST] Research Labs (https://cdac.in/index.aspx?id=mlc_gist_about), Omniglot, and inputs provided by various experts on board the NBGP for specific languages. Only Omniglot references have been provided, as they are available online.
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)
The Visarga is frequently used in Sanskrit and represents a sound very close to /h/, for example दुःख /duhkh/ "sorrow, unhappiness" (U+0926 U+0941 U+0903 U+0916).
The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages using Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire, as it is barred in the Maximal Starting Repertoire.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Halant) where default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष /ksha/, which gets formed by क /ka/ + ् (halant) + ष /sha/, gets rendered as क्‌ष when formed by क /ka/ + ् (halant) + Zero Width Non-joiner + ष /sha/. In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.
The Zero Width Joiner (ZWJ) is another invisible character, which is used in certain cases (mostly after Halant) in which a particular conjunct combination gets rendered such that the constituting consonant shapes may not be directly visible in the conjunct shape. For example, the conjunct क्ष /ksha/, which gets formed by क /ka/ + ् (halant) + ष /sha/, does not show the half form of ka joining with sha. However, using ZWJ, the constituting consonants' shapes are preserved in the visual depiction, i.e. क्‍ष, formed by क /ka/ + ् (halant) + Zero Width Joiner + ष /sha/.
Earlier, the ZWJ was recommended by the Unicode Consortium to be used to generate certain special conjuncts like Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of ZWJ is now not encouraged.
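Because ZWNJ and ZWJ are invisible, the three spellings of the ka + sha cluster discussed above are easiest to tell apart by listing their code points. The following sketch builds them explicitly; the variable and function names are chosen for this illustration only.

```python
KA, HALANT, SSA = "\u0915", "\u094D", "\u0937"
ZWNJ, ZWJ = "\u200C", "\u200D"


def codepoints(text):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


conjunct = KA + HALANT + SSA                 # default conjunct form
explicit_halant = KA + HALANT + ZWNJ + SSA   # halant shown, no conjunct
half_form = KA + HALANT + ZWJ + SSA          # half form of ka kept visible

for name, text in [("conjunct", conjunct),
                   ("explicit halant", explicit_halant),
                   ("half form", half_form)]:
    print(f"{name:16} {codepoints(text)}")
```

How each sequence is actually drawn depends on the font and shaping engine; the code only shows that the three strings differ at the code point level.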
4 Overall Development Process and Methodology
Under the Neo-Brahmi Generation Panel there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP ensured that the fundamental philosophy behind building those LGRs is in sync across all Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code point repertoire, across the board, for all the scripts within its ambit.
4.1.1 Inclusion principles
4.1.1.1 Modern usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code point repertoire.
4.1.1.2 Unambiguous use
Every character proposed should have an unambiguous understanding among the linguistic community about its usage in the language.
4.1.2 Exclusion principles
The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope
The code point repertoire for the root zone being a very special case up the ladder in the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Standard
Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.
For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ़ (U+093C), can be used instead.
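The relationship between the precomposed letter and its decomposed form can be checked with Python's standard `unicodedata` module. U+0958 is a Unicode composition exclusion, so NFC normalization turns it into the two-code-point sequence and never recomposes it. That this instability is the reason IDNA disallows U+0958 is the usual explanation, but it is stated here as an assumption; the text above only says that the code point is not allowed.

```python
import unicodedata

QA_PRECOMPOSED = "\u0958"   # DEVANAGARI LETTER QA
KA_NUKTA = "\u0915\u093C"   # DEVANAGARI LETTER KA + DEVANAGARI SIGN NUKTA


def nfc_codepoints(text):
    """Return the code points of the NFC-normalized form of the text."""
    normalized = unicodedata.normalize("NFC", text)
    return [f"U+{ord(ch):04X}" for ch in normalized]


# Both spellings normalize to the same decomposed sequence.
print(nfc_codepoints(QA_PRECOMPOSED))
print(nfc_codepoints(KA_NUKTA))
```

In practice this means a label typed with the precomposed letter would need to be normalized before it is compared against the repertoire.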
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These are required in certain cases where a typical visual shape of an akshar is desired.
In domain names, due to the absence of space " " or tab "-", there will be cases where the inability to use ZWNJ can pose some issues: where two words need to be joined together, the previous word needs to end in an Explicit Halant and the next word begins with a consonant. In that case a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if two words देश् (/deš/ "nation") and विदेश (/videš/ "foreign land") are juxtaposed to each other, the resultant word, i.e. "देश्विदेश",¹⁰ is not the appropriate way of rendering it. Appropriate rendering of the same would be "देश्‌विदेश", which can be achieved by adding a ZWNJ in between the two words.
As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire if necessary.
However, there may not be much of an impact from the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes¹¹ may not be able to be made; however, there will not be any impact at the phonetic level.
iii. Maximal Starting Repertoire
The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].
To sum up, the restrictions start off by admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
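The three successive filters can be pictured as a short chain of checks. The sketch below is illustrative only: the exclusion sets contain just the code points named in this section, not the actual IDNA or MSR tables, and `classify` is a hypothetical helper.

```python
import unicodedata

# Illustrative exclusions taken from the examples in this section only.
IDNA_EXCLUDED = {0x0958}                 # DEVANAGARI LETTER QA
MSR_EXCLUDED = {0x093D, 0x200C, 0x200D}  # Avagraha, ZWNJ, ZWJ


def classify(code_point):
    """Report the first filter that rejects a code point, if any."""
    # Filter i: the character must be encoded in Unicode.
    # (Approximation: unassigned code points have no character name.)
    if unicodedata.name(chr(code_point), None) is None:
        return "not encoded in Unicode"
    # Filter ii: it must be permitted by the IDNA protocol.
    if code_point in IDNA_EXCLUDED:
        return "excluded by IDNA"
    # Filter iii: it must be part of the Maximal Starting Repertoire.
    if code_point in MSR_EXCLUDED:
        return "excluded by MSR"
    return "eligible"


for cp in (0x0915, 0x0958, 0x093D):
    print(f"U+{cp:04X}: {classify(cp)}")
```

The order of the checks mirrors the text: a code point rejected by an earlier filter is never examined by a later one.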
10 In this particular case, though it is possible to get the required display by dropping the explicit Halant at the end of the word, one can argue that the pronunciation of the two words, i.e. देश् and देश, is different, and hence it changes the fundamental word.
11 Case of क्ष and क्‍ष: the first is composed with क + ् + ष, while the latter is with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.
4.1.2.3 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970) etc. will not be included.
4.1.2.4 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, especially like DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ॄ (U+0944) and ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.
4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through the Public Comment process
The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were editorial in nature. Some comments called for attention to the normative section of the document. The NBGP at large, and the Devanagari team specifically, went through the comments in detail and decided, for each individual comment received, whether it needed a change in the overall LGR recommendation.
Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change is not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2: Devanagari Code Page from [MSR]
Color convention:¹²
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the MSR for various reasons - Pinkish background
Not PVALID in IDNA2008 - White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references of Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
SrNo
UnicodeCodePoint
Glyph CharacterName Category
Examplelanguagesusingthecodepoint(Notexhaustive
list)
LanguagewithlowestEGIDSscaleusingthecodepoint
Reference
1 0901 DEVANAGARISIGNCANDRABINDU Candrabindu
BodoHindiKashmiriKonkaniMaithiliMarathiNepaliSantaliand
Sanskrit
1HindiNepali
[0][101][102][103][105][108][110][111][112]
[113]
2 0902 DEVANAGARISIGNANUSVARA
Anusvara(Bindu)
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
3 0903 ः DEVANAGARISIGNVISARGA Visarga
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
4 0905 अ DEVANAGARILETTERA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
5 0906 आ DEVANAGARILETTERAA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
6 0907 इ DEVANAGARILETTERI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
7 0908 ई DEVANAGARILETTERII Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
8 0909 उ DEVANAGARILETTERU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
9 090A ऊ DEVANAGARILETTERUU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
10 090B ऋ DEVANAGARI
LETTERVOCALICR
Vowel HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
11 090D ऍ DEVANAGARILETTERCANDRAE Vowel Hindi 1Hindi [0][101]
12 090E ऎ DEVANAGARILETTERSHORTE Vowel Kashmiri 4Kashmiri [0][105][108]
13 090F ए DEVANAGARILETTERE Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
14 0910 ऐ DEVANAGARILETTERAI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
15 0911 ऑ DEVANAGARI
LETTERCANDRAO
Vowel HindiKonkaniMarathiKashmiri 1Hindi
[0][100][101][102][108]
[112]
16 0912 ऒ DEVANAGARILETTERSHORTO Vowel Kashmiri 4Kashmiri [0][105][108]
17 0913 ओ DEVANAGARILETTERO Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
18 0914 औ DEVANAGARILETTERAU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
19 0915 क DEVANAGARILETTERKA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
20 0916 ख DEVANAGARILETTERKHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
21 0917 ग DEVANAGARILETTERGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
22 0918 घ DEVANAGARILETTERGHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
23 0919 ङ DEVANAGARILETTERNGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
24 091A च DEVANAGARILETTERCA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
25 091B छ DEVANAGARILETTERCHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
26 091C ज DEVANAGARILETTERJA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
27 091D झ DEVANAGARILETTERJHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
28 091E ञ DEVANAGARILETTERNYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
29 091F ट DEVANAGARILETTERTTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
30 0920 ठ DEVANAGARILETTERTTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
31 0921 ड DEVANAGARILETTERDDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
32 0922 ढ DEVANAGARILETTERDDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
33 0923 ण DEVANAGARILETTERNNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
34 0924 त DEVANAGARILETTERTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
35 0925 थ DEVANAGARILETTERTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
36 0926 द DEVANAGARILETTERDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
37 0927 ध DEVANAGARILETTERDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
38 0928 न DEVANAGARILETTERNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
39 092A प DEVANAGARILETTERPA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
40 092B फ DEVANAGARILETTERPHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
41 092C ब DEVANAGARILETTERBA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
42 092D भ DEVANAGARILETTERBHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
43 092E म DEVANAGARILETTERMA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
44 092F य DEVANAGARILETTERYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
45 0930 र DEVANAGARILETTERRA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
46 0932 ल DEVANAGARILETTERLA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
47 0933 ळ DEVANAGARILETTERLLA Consonant
BodoKonkaniMarathiNepali
Sanskrit1Nepali
[0][102][103][110][112]
[113]
48 0935 व DEVANAGARILETTERVA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
49 0936 श DEVANAGARILETTERSHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
50 0937 ष DEVANAGARILETTERSSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
51 0938 स DEVANAGARILETTERSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
52 0939 ह DEVANAGARILETTERHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
53 093A DEVANAGARI VOWEL SIGN OE Matra Kashmiri 4Kashmiri [11][105][108]
54 093B ऻ DEVANAGARI VOWEL SIGN OOE Matra Kashmiri 4Kashmiri [11][105][108]
55 093C DEVANAGARISIGNNUKTA Nukta
BodoHindiKashmiriMaithiliSantaliSindhi
1Hindi[0][101][105][108][110][109][111]
56 093E ा DEVANAGARIVOWELSIGNAA Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
57 093F ि DEVANAGARIVOWELSIGNI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
58 0940 ी DEVANAGARIVOWELSIGNII Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
59 0941 DEVANAGARIVOWELSIGNU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
60 0942 DEVANAGARIVOWELSIGNUU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
61 0943
DEVANAGARIVOWELSIGNVOCALICR
Matra HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
62 0945
DEVANAGARIVOWELSIGNCANDRAE=candra
MatraHindiKonkaniMarathiSanskrit
Kashmiri1Hindi [0][100][101]
[108]
63 0946
DEVANAGARIVOWELSIGNSHORTE
Matra Kashmiri 4Kashmiri [0][105][108]
64 0947 DEVANAGARIVOWELSIGNE Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
65 0948 DEVANAGARIVOWELSIGNAI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
66 0949 ॉ DEVANAGARIVOWELSIGNCANDRAO
Matra HindiKonkaniMarathiKashmiri 1Hindi [0][100][108]
67 094A ॊ DEVANAGARIVOWELSIGNSHORTO
Matra Kashmiri 4Kashmiri [0][105][108]
68 094B ो DEVANAGARIVOWELSIGNO Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
69 094C ौ DEVANAGARIVOWELSIGNAU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
70 094D DEVANAGARISIGN
VIRAMAHalantVirama
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
71 094F ॏ DEVANAGARIVOWELSIGNAW Matra Kashmiri 4Kashmiri [0][105][108]
72 0956 DEVANAGARIVOWELSIGNUE Matra Kashmiri 4Kashmiri [11][105][108]
73 0957 DEVANAGARIVOWELSIGNUUE Matra Kashmiri 4Kashmiri [11][105][108]
74 0972 ॲ
DEVANAGARILETTERCANDRA
AVowel KonkaniMarathi
Kashmiri2KonkaniMarathi
[9][100][102][108][112]
75 0973 ॳ DEVANAGARILETTEROE Vowel Kashmiri 4Kashmiri [11][105][108]
76 0974 ॴ DEVANAGARILETTEROOE Vowel Kashmiri 4Kashmiri [11][105][108]
77 0975 ॵ DEVANAGARILETTERAW Vowel Kashmiri 4Kashmiri [11][105][108]
78 0976 ॶ DEVANAGARILETTERUE Vowel Kashmiri 4Kashmiri [11][105][108]
79 0977 ॷ DEVANAGARILETTERUUE Vowel Kashmiri 4Kashmiri [11][105][108]
80 097B ॻ DEVANAGARILETTERGGA Consonant Sindhi 2Sindhi [8][104]
81 097C ॼ DEVANAGARILETTERJJA Consonant Sindhi 2Sindhi [8][104]
82 097E ॾ DEVANAGARILETTERDDDA Consonant Sindhi 2Sindhi [8][104]
83 097F ॿ DEVANAGARILETTERBBA Consonant Sindhi 2Sindhi [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph"¹³ construct.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7: Sequences
13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by Normal Ra, U+0930), the term "Reph" is used here.
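Conditional inclusion means that U+0931 is acceptable only as the first code point of one of the two sequences in Table 7. A minimal check along those lines might look as follows; the function name `rra_usage_ok` is hypothetical, and the sketch tests this one condition only, not the rest of the LGR.

```python
RRA = "\u0931"
ALLOWED_SEQUENCES = (
    "\u0931\u094D\u092F",  # RRA + VIRAMA + YA
    "\u0931\u094D\u0939",  # RRA + VIRAMA + HA
)


def rra_usage_ok(label):
    """Return True if every RRA in the label starts an allowed sequence."""
    i = 0
    while i < len(label):
        if label[i] == RRA:
            if not any(label.startswith(seq, i) for seq in ALLOWED_SEQUENCES):
                return False
            i += 3  # skip the whole three-code-point sequence
        else:
            i += 1
    return True


print(rra_usage_ok("\u0926\u0931\u094D\u092F\u093E"))  # RRA inside sequence 1
print(rra_usage_ok("\u0931\u093E"))                    # bare RRA + matra
```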
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. In the next section there are detailed akshar formation rules as applicable to representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7 the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second section lists four operators, along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen -
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
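To see how a string maps onto these variables, each code point can be replaced by its category letter. The ranges below are coarse Unicode block ranges assumed for this sketch; they are wider than the actual repertoire in Table 6 (for instance, they include excluded letters such as U+0929), and the helper name `to_variables` is hypothetical.

```python
# (variable, first code point, last code point); coarse ranges for illustration.
CATEGORY_RANGES = [
    ("D", 0x0901, 0x0901),  # Candrabindu
    ("B", 0x0902, 0x0902),  # Anusvara (Bindu)
    ("X", 0x0903, 0x0903),  # Visarga
    ("V", 0x0905, 0x0914),  # Vowels
    ("C", 0x0915, 0x0939),  # Consonants
    ("N", 0x093C, 0x093C),  # Nukta
    ("M", 0x093E, 0x094C),  # Matras
    ("H", 0x094D, 0x094D),  # Halant/Virama
]


def to_variables(text):
    """Map each code point to its variable letter, or '?' if uncategorized."""
    letters = []
    for ch in text:
        cp = ord(ch)
        for name, first, last in CATEGORY_RANGES:
            if first <= cp <= last:
                letters.append(name)
                break
        else:
            letters.append("?")
    return "".join(letters)


print(to_variables("\u0915\u0940\u0902"))  # KA + II matra + Anusvara
print(to_variables("\u0928\u094D\u0915"))  # NA + Halant + KA
```

The resulting letter strings can then be compared directly with the sequence notation used in the rules.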
5.5.2 Operators used
Symbol    Function
|         Alternative
[ ]       Optional
*         Variable repetition
( )       Sequence/group
Table 8: Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ /a/ | U+0905
Vowel + Anusvara | V[B] | अं /aṁ/ | अ + ं (U+0905 U+0902)
Vowel + Candrabindu | V[D] | अँ /aṃ/ | अ + ँ (U+0905 U+0901)
Vowel + Visarga | V[X] | अः /aḥ/ | अ + ः (U+0905 U+0903)
Table 9
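The rule V[B|D|X] translates almost directly into a regular expression. In the sketch below the vowel class is assumed to be the independent vowels U+0905 to U+0914 without U+090C (which Section 5.3 excludes); the Kashmiri and Marathi additions in the U+0972 to U+0977 range are left out for brevity, and the names are chosen for this illustration.

```python
import re

VOWELS = "\u0905-\u090B\u090D-\u0914"  # V: independent vowels (assumed subset)
MODIFIERS = "\u0902\u0901\u0903"       # B, D, X: Anusvara, Candrabindu, Visarga

# V followed by at most one of B, D or X.
VOWEL_SEQUENCE = re.compile(f"[{VOWELS}][{MODIFIERS}]?")


def is_vowel_sequence(text):
    """Return True if the whole string is a single vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None


print(is_vowel_sequence("\u0905\u0902"))        # V[B]
print(is_vowel_sequence("\u0905\u0901\u0903"))  # D followed by X is ruled out
```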
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the Consonant sequence after the N, M and H. Each of these has been discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क /ka/ | U+0915 <single character>
Consonant + Nukta | C[N] | क़ /ḳa/ | क + ़ (U+0915 U+093C)
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], Visarga [X] or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि /ki/ | क + ि (U+0915 U+093F)
Consonant + Anusvara | C[B] | कं /kaṁ/ | क + ं (U+0915 U+0902)
Consonant + Candrabindu | C[D] | कँ /kaṃ/ | क + ँ (U+0915 U+0901)
Consonant + Visarga | C[X] | कः /kaḥ/ | क + ः (U+0915 U+0903)
Consonant + Halant | C[H] | क् /k/ (pure consonant) | क + ् (U+0915 U+094D)
Table 11
2A. A CM sequence can optionally be followed by D, B or X:

(CM)[D|B|X]

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं (kīṁ) | U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ (kāṃ) | U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः (kīḥ) | U+0915 U+0940 U+0903

Table 12
3. A sequence of consonants (up to 4) joined by Halant¹⁴:

*3(CH)C

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य (nkrya) | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F

Table 13

However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.

14 In case of Sanskrit, it can join up to 5 consonants.

Subsets

3A. The combination may be followed by M, B, D or X.

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की (kkī) | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं (kkaṁ) | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ (kkaṃ) | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः (kkaḥ) | U+0915 U+094D U+0915 U+0903

Table 14
3B. *3(CH)CM may be followed by a B, D or X.

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं (kkīṁ) | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ (kkīṃ) | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः (kkīḥ) | U+0915 U+094D U+0915 U+0940 U+0903

Table 15
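
The consonant sequence patterns of Section 5.5.4 (cases 1, 2, 2A, 3, 3A and 3B) can be combined into one regular expression. The sketch below is an illustration under stated assumptions: the classes C and M are approximated by the ranges U+0915 to U+0939 and U+093E to U+094C, which is not the LGR repertoire, and the limit of four consonants reflects the Hindi description above rather than the WLE rules of Section 7. The name `is_consonant_sequence` is hypothetical.

```python
import re

# Assumptions for illustration only (not the LGR repertoire):
C = "[\u0915-\u0939]"           # consonants, approximated by a range
M = "[\u093E-\u094C]"           # matras, approximated by a range
N, H = "\u093C", "\u094D"       # Nukta, Halant
BDX = "[\u0902\u0901\u0903]"    # Anusvara, Candrabindu, Visarga
CN = f"{C}{N}?"                 # a consonant with its optional Nukta

# *3(CH)C followed by either a final Halant or by [M][B|D|X].
CONSONANT_SEQUENCE = re.compile(
    f"(?:{CN}{H}){{0,3}}{CN}(?:{H}|{M}?{BDX}?)"
)


def is_consonant_sequence(text: str) -> bool:
    """Return True if the whole text is one consonant sequence."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

The example from Table 13 (U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F) matches this pattern, while a chain of five consonants joined by Halant does not.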
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuations and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
6 Variants

There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:

Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String Similarity Assessment Panel) entrusted to deal with such cases. Table 21: Visually confusables in Appendix A: Visually confusable characters/sequences lists them.

Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.

6.1 Vowel/Vowel sign followed by Nukta

The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.

This gives rise to a possibility of creation of certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C

Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants

All of the Nukta variants given in Table 16: Proposed Variants - Set 1 share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new case of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C), where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.

Rule: As per Table 16: Proposed Variants - Set 1, the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus, the following variant relations are bound by the above condition:

आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)

The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
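
The effect of the "not followed by a Nukta" condition can be sketched as follows. This is a simplified illustration, not the LGR tooling: the function name `nukta_variants` is hypothetical, and only the four pairs of Table 16 are modelled. Each occurrence of a Variant 1 code point, with or without its Nukta, contributes both forms, unless a further Nukta follows, in which case no mapping applies.

```python
from itertools import product

NUKTA = "\u093C"
# Variant 1 code points from Table 16 (Santali vowels and matras).
SANTALI_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variants(label: str) -> set[str]:
    """Return the label together with its Table 16 variant labels."""
    options = []
    i = 0
    while i < len(label):
        ch = label[i]
        if ch in SANTALI_BASES:
            has_nukta = label[i + 1:i + 2] == NUKTA
            end = i + 2 if has_nukta else i + 1
            # Context rule: the mapping is defined only when the matched
            # source is not followed by (another) Nukta.
            if label[end:end + 1] != NUKTA:
                options.append((ch, ch + NUKTA))
            else:
                options.append((label[i:end],))
            i = end
        else:
            options.append((ch,))
            i += 1
    return {"".join(parts) for parts in product(*options)}
```

Under this sketch, U+0906 and U+0906 U+093C yield the same two-member set, and a valid label never acquires a sequence of two Nuktas.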
6.1.2 Overlapped variant analysis involving Nukta

Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.

Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:

Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N

When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the left-hand-side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:

093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.

For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.

Likewise, for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive a context rule when(followed-by-V-C-or-end).

The resulting variant sets and variant contextual rules are:

Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)

Additional Notes:

1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.

2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
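
The context rule when(followed-by-V-C-or-end) for Variant Set C can be sketched as below. This is an illustration under assumptions: vowels and consonants are approximated by code point ranges rather than by the LGR repertoire, and the helper names `followed_by_v_c_or_end` and `set_c_variants` are hypothetical.

```python
def is_v_or_c(ch: str) -> bool:
    """Assumption: V and C are approximated by these code point ranges."""
    cp = ord(ch)
    return (0x0904 <= cp <= 0x0914 or 0x0972 <= cp <= 0x0977
            or 0x0915 <= cp <= 0x0939)


def followed_by_v_c_or_end(label: str, end: int) -> bool:
    """True if the position just past the match is a V, a C, or the end."""
    return end >= len(label) or is_v_or_c(label[end])


# Variant Set C after the Nukta extension (Section 6.1.2).
SET_C = ("\u0906\u0902", "\u0906\u093C\u0902", "\u0974")


def set_c_variants(label: str, start: int) -> list[str]:
    """Variant labels produced by a Set C member found at index start."""
    for source in SET_C:
        end = start + len(source)
        if label.startswith(source, start) and followed_by_v_c_or_end(label, end):
            return [label[:start] + target + label[end:]
                    for target in SET_C if target != source]
    return []
```

For instance, U+0974 at the end of a label maps to both other members of Set C, whereas U+0974 followed by U+0902 yields no variant because the context is not met.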
6.2 Unique Vowels and Vowel Signs required for Kashmiri

Kashmiri, when written in Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence, they are being proposed as variants.

Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
ऺ U+093A | ं U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
ॆ U+0946 | े U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C

Table 17: Proposed Variants - Set 2

No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)

Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.

In these cases, the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess if these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara

This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of the variants are as follows:

Variant 1 | Variant 2
ँ U+0901 | ॅं U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902

Table 18: Proposed Variants - Set 3

As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character like a Consonant or a Vowel before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):

Variant Pair 1: कँ (U+0915 U+0901) and कॅं (U+0915 U+0945 U+0902)
Variant Pair 2: काँ (U+0915 U+093E U+0901) and कॉं (U+0915 U+0949 U+0902)

Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should each be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair

The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. (U+0945). While the Candrabindu (U+0901) can be preceded by either of Consonants, Vowels, Nukta or a Matra, the same is not true for (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.

Rule: As per Table 18: Proposed Variants - Set 3, the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.

There is no additional constraint on the preceding character of (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
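
The context condition of this rule can be illustrated with a short sketch. It models only the mapping (U+0901) -> (U+0945 U+0902); the other pairs of Table 18 are not handled. Consonants are approximated by the range U+0915 to U+0939, which is an assumption, and the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093C"


def is_consonant(ch: str) -> bool:
    """Assumption: consonants are approximated by U+0915..U+0939."""
    return "\u0915" <= ch <= "\u0939"


def preceded_by_c_or_cn(label: str, index: int) -> bool:
    """True if label[index] follows a consonant or a consonant plus Nukta."""
    if index >= 1 and is_consonant(label[index - 1]):
        return True
    return (index >= 2 and label[index - 1] == NUKTA
            and is_consonant(label[index - 2]))


def expand_candrabindu(label: str) -> str:
    """Replace each eligible U+0901 with U+0945 U+0902."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and preceded_by_c_or_cn(label, i):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu after a vowel or a Matra is left unchanged by this sketch, since a Matra such as U+0945 could not validly occupy that position.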
6.5 Variant Disposition

As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.

There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
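
The "first chosen, others blocked" behaviour can be sketched with a shared lookup key for all labels in a variant set. The `Registry` class below is purely hypothetical and is not part of any LGR tool; for brevity it covers only the four single-code-point pairs of Table 17, which need no context rule.

```python
# Single-code-point pairs from Table 17, mapped Variant 2 -> Variant 1.
TO_KEY = str.maketrans({
    "\u0910": "\u090E",
    "\u0947": "\u0946",
    "\u0914": "\u0975",
    "\u094C": "\u094F",
})


class Registry:
    """Hypothetical registry that blocks variants of allocated labels."""

    def __init__(self) -> None:
        self._allocated: dict[str, str] = {}

    def apply(self, label: str) -> bool:
        """Allocate the label unless an equivalent variant is taken."""
        key = label.translate(TO_KEY)
        if key in self._allocated:
            return False  # blocked: a variant label was chosen earlier
        self._allocated[key] = label
        return True
```

Because both members of a pair reduce to the same key, neither is preferred: whichever label is applied for first is allocated, and the other is blocked.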
6.6 Cross-script Variants

A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.

Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.

NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + …" cases. However, for Devanagari all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)

The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. Cases listed in Table 19 are the variants that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.

It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.

NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं U+0902 | ਂ U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
ऺ U+093A | ਂ U+0A02
़ U+093C | ਼ U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
ॅ U+0945 | ੱ U+0A71
ॆ U+0946 | ੇ U+0A47
ॆ U+0946 | ੋ U+0A4B
े U+0947 | ੇ U+0A47
े U+0947 | ੋ U+0A4B
ै U+0948 | ੈ U+0A48
ॖ U+0956 | ੁ U+0A41
ॗ U+0957 | ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C

Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants

Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF

Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22: Devanagari Cross-script confusables in Appendix B: Cross-script Confusables lists them.

7 Whole Label Evaluation Rules (WLE)

This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.

Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6: Code point repertoire:

C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)

Below are the specific WLE rules:

1. N must be preceded only by a member of C1, V1 or M1.

The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)

The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)

The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)

2. H must be preceded by C or CN¹⁵
3. M must be preceded by C or CN¹⁶
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in Case of V preceded by H)

15 where CN is a C followed by an N
16 where CN is a C followed by an N

Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:

• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
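
WLE rules 1 to 7 can be sketched as a single pass over a label. The following is a simplified illustration, not the normative XML: the `category` function assigns classes by code point range, which is an assumption standing in for the Table 6 repertoire, code points outside these ranges are simply not checked, and the name `first_violation` is hypothetical.

```python
def category(ch: str):
    """Assumption: categories are approximated by code point ranges."""
    cp = ord(ch)
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x0904 <= cp <= 0x0914 or 0x0972 <= cp <= 0x0977:
        return "V"
    if cp in (0x093A, 0x093B, 0x094F, 0x0956, 0x0957) or 0x093E <= cp <= 0x094C:
        return "M"
    return {0x0902: "B", 0x0901: "D", 0x0903: "X",
            0x094D: "H", 0x093C: "N"}.get(cp)


# Members of C1, V1 and M1 that may carry a Nukta (rule 1).
NUKTA_BASES = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B"
                  "\u0906\u0913" "\u093E\u094B")


def first_violation(label: str):
    """Return the number of the first WLE rule broken, or None."""
    cats = [category(ch) for ch in label]
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        after_c_or_cn = prev == "C" or (
            prev == "N" and i >= 2 and cats[i - 2] == "C")
        if cat == "N" and (i == 0 or label[i - 1] not in NUKTA_BASES):
            return 1
        if cat == "H" and not after_c_or_cn:
            return 2
        if cat == "M" and not after_c_or_cn:
            return 3
        if cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            return {"X": 4, "B": 5, "D": 6}[cat]
        if cat == "V" and prev == "H":
            return 7
    return None
```

As a usage example, the label U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930, in which a vowel follows a Halant, is reported against rule 7.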
Case of Eyelash Reph

In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:

1. As U+0931 is added as a part of permissible sequences in Table 7: Sequences, it gets permitted only with the specific sequences.

2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.

Case of V preceded by H

As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any of the valid akshar-ending characters. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.

There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).

This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.

If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors

NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data

Following is the full list of NBGP members with their language expertise:

Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | DataXgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | DataXgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DN Domains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkz.com | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and .np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ. of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi

In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:

Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K P Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References

[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)

[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)

[NBGP] Neo-Brahmi Generation Panel

[RFC 7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)

[RFC 8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)

[gTLD] generic Top Level Domain

[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)

[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)

[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)

[0] The Unicode Standard 11, http://www.unicode.org/versions/Unicode11.0.0/ (Accessed on 12th Dec 2017)

[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)

[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)

[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)

[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)

[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)

[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)

[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)

[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)

[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)

[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)

[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)

[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)

[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)

[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)

[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)

[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)

[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)

[114] NBGP Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
11 Appendix A: Visually confusable characters/sequences

Table 21 below shows characters/character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C

Table 21: Visually confusables
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.

At first they may not look exactly the same; however, in the given context, e.g. in a browser bar, as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.

A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
Devanagari confusable   Other script confusable   From script
ः  U+0903               ઃ  U+0A83                 Gujarati
ः  U+0903               ః  U+0C03                 Telugu
ः  U+0903               ಃ  U+0C83                 Kannada
ः  U+0903               ഃ  U+0D03                 Malayalam
ः  U+0903               ඃ  U+0D83                 Sinhala
ः  U+0903               ঃ  U+0983 ¹⁷              Bengali
उ  U+0909               ও  U+0993                 Bengali
घ  U+0918               ঘ  U+0998                 Bengali
ठ  U+0920               ਨ  U+0A28                 Gurmukhi
ठ  U+0920               ਰ  U+0A30                 Gurmukhi
ड  U+0921               ਡ  U+0A21                 Gurmukhi
ड  U+0921               ਤ  U+0A24                 Gurmukhi
ढ  U+0922               ਢ  U+0A22                 Gurmukhi
त  U+0924               ਜ  U+0A1C                 Gurmukhi
य  U+092F               ਧ  U+0A27                 Gurmukhi
ॅ  U+0945               ঁ  U+0981                 Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
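The all-or-nothing condition stated before Table 22 can be sketched in Python. This is only an illustration under stated assumptions: the mapping holds just the Gurmukhi rows of Table 22, the check works per code point rather than per akshara, and the names GURMUKHI_CONFUSABLES and has_cross_script_confusable are hypothetical, not part of the LGR.

```python
# Gurmukhi rows of Table 22: Devanagari code point -> confusable Gurmukhi code points.
GURMUKHI_CONFUSABLES = {
    "\u0920": {"\u0A28", "\u0A30"},  # DEVANAGARI TTHA -> GURMUKHI NA, RA
    "\u0921": {"\u0A21", "\u0A24"},  # DEVANAGARI DDA  -> GURMUKHI DDA, TA
    "\u0922": {"\u0A22"},            # DEVANAGARI DDHA -> GURMUKHI DDHA
    "\u0924": {"\u0A1C"},            # DEVANAGARI TA   -> GURMUKHI JA
    "\u092F": {"\u0A27"},            # DEVANAGARI YA   -> GURMUKHI DHA
}


def has_cross_script_confusable(label: str) -> bool:
    """True only if every code point of the label has a Gurmukhi confusable.

    A single code point without a counterpart is treated as enough of a
    visual distinction to make the whole label non-confusable.
    """
    return bool(label) and all(ch in GURMUKHI_CONFUSABLES for ch in label)
```

A label made only of ठ and ड passes the check, while adding क (which has no entry in the mapping) makes it fail.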
ॶ  U+0976        ॖ  U+0956
ॷ  U+0977        ॗ  U+0957
Table 5 Vowels with corresponding Matras
Marathi uses ॲ (U+0972) instead of ऍ (U+090D).
3.3.4 The Anusvara (ं - U+0902)
The Anusvara represents a homorganic nasal. It replaces a conjunct group of a Nasal
Consonant + Halant + Consonant belonging to that particular varga. Before a non-varga consonant, the Anusvara represents a nasal sound. Modern Hindi, Marathi and Konkani languages prefer the Anusvara to the corresponding Half-nasal.⁸
सन्त vs संत (sənt, saint): U+0938 U+0928 U+094D U+0924 vs U+0938 U+0902 U+0924
चम्पा vs चंपा (tʃəmpa, a flower belonging to the genus Plumeria family): U+091A U+092E U+094D U+092A U+093E vs U+091A U+0902 U+092A U+093E
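The replacement of Nasal Consonant + Halant by the Anusvara before a consonant of the same varga can be sketched as a small substitution. This is a minimal sketch, not a normalization rule of the LGR: the helper name prefer_anusvara and the table VARGA_NASALS are hypothetical, and the varga ranges cover only the non-nasal consonants of each varga.

```python
import re

HALANT = "\u094D"
ANUSVARA = "\u0902"

# Nasal consonant of each varga -> range of the non-nasal consonants of that varga.
VARGA_NASALS = {
    "\u0919": "\u0915-\u0918",  # NGA before KA..GHA
    "\u091E": "\u091A-\u091D",  # NYA before CA..JHA
    "\u0923": "\u091F-\u0922",  # NNA before TTA..DDHA
    "\u0928": "\u0924-\u0927",  # NA  before TA..DHA
    "\u092E": "\u092A-\u092D",  # MA  before PA..BHA
}


def prefer_anusvara(word: str) -> str:
    """Rewrite nasal + halant as Anusvara when a same-varga consonant follows."""
    for nasal, varga in VARGA_NASALS.items():
        word = re.sub(f"{nasal}{HALANT}(?=[{varga}])", ANUSVARA, word)
    return word
```

Applied to the two examples above, the function maps U+0938 U+0928 U+094D U+0924 to U+0938 U+0902 U+0924 and U+091A U+092E U+094D U+092A U+093E to U+091A U+0902 U+092A U+093E. A conjunct with a non-varga consonant, such as न + halant + य, is left unchanged.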
3.3.5 Nasalization: Candrabindu (ँ - U+0901)
Candrabindu denotes nasalization of the preceding vowel, as in आँख (ãkh, eye; U+0906 U+0901 U+0916). Present-day Hindi users tend to replace the Candrabindu by the Anusvara.
3.3.6 Nukta (़ - U+093C)⁹
The Nukta sign is placed below a certain number of consonants to represent sounds found
only in words borrowed from Perso-Arabic. It is predominantly used in this manner in Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi and Tamang. It can be adjoined to क (U+0915), ख (U+0916), ग (U+0917), ज (U+091C) and फ (U+092B) to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic
style, e.g.
फ़िरोज़ firoz (U+092B U+093C U+093F U+0930 U+094B U+091C U+093C)
It is also placed under ड (U+0921) and ढ (U+0922) to indicate flapped sounds, e.g.
बढ़ bədh (U+092C U+0922 U+093C)
8 A half-nasal is used in epigraphy to indicate a nasal consonant conjoined to its corresponding “Varga” through a Halant.
9 The possible sets of consonants/vowels have been derived from various sources, viz. prior research carried out by the Centre for Development of Advanced Computing's [C-DAC] Graphics and Intelligence based Script Technologies [GIST] Research Labs (https://cdac.in/index.aspx?id=mlc_gist_about), Omniglot, and inputs provided by various experts on board the NBGP for specific languages. Only Omniglot references have been provided, as they are available online.
The web publication DEVANĀGARĪ ALPHABET AND ITS ROMANIZATION [109] by the Central
Hindi Directorate, Ministry of HRD, Government of India, clearly states such a use of Nukta
in Hindi.
In Bodo, the Nukta is adjoined to ड (U+0921) [110]. In Maithili, it is adjoined to क (U+0915), ज (U+091C), ड (U+0921) and ढ (U+0922) [111]. In Sindhi, it is adjoined to ख (U+0916), ग (U+0917), ज (U+091C), फ (U+092B), ड (U+0921) and ढ (U+0922) [104].
In Kashmiri, it can also be adjoined to च (U+091A), छ (U+091B) and ज (U+091C) [108] to indicate the laterally released affricates:
च़ाय cay, tea (U+091A U+093C U+093E U+092F)
छ़ल chal, wash (imperative) (U+091B U+093C U+0932)
पॊज़ póz, fact (U+092A U+094A U+091C U+093C)
Normally a Nukta is appended to a consonant. However, the Santali language uses the Nukta in
a unique way: the Nukta is adjoined to the following vowels and vowel signs:
a. आ (U+0906)
b. ओ (U+0913)
c. ा (U+093E)
d. ो (U+094B)
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)
The Visarga is frequently used in Sanskrit and represents a sound very close to h, for
example दुःख (duhkh, sorrow/unhappiness; U+0926 U+0941 U+0903 U+0916).
The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages using Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire, as it is barred in the Maximal Starting Repertoire.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after
Halant) where default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष (ksha), which gets formed by क ka + ् (halant) + ष sha, gets rendered with a visible Halant on क when formed by क ka + ् (halant) + Zero Width Non-joiner + ष sha. In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.
The Zero Width Joiner (ZWJ) is another invisible character, which is used in certain cases (mostly after Halant) in which a particular conjunct combination gets rendered such that
the constituent consonant shapes may not be directly visible in the conjunct shape. For example, the conjunct क्ष (ksha), which gets formed by क ka + ् (halant) + ष sha, does not show the half form of ka joining with sha. However, using ZWJ, the constituent consonants'
shapes are preserved in the visual depiction, i.e. the half form of क followed by ष, formed by क ka + ् (halant) + Zero Width Joiner + ष sha.
Earlier, the ZWJ was recommended by the Unicode Consortium to be used to generate certain special conjuncts like Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of ZWJ is now not encouraged.
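Because ZWNJ and ZWJ are invisible, the three renderings of ksha discussed above differ only in their underlying code points. The following sketch spells those sequences out; the dictionary keys and the helper code_points are illustrative names, and how each sequence is actually drawn depends on the font and rendering engine.

```python
import unicodedata

KA, HALANT, SSA = "\u0915", "\u094D", "\u0937"
ZWNJ, ZWJ = "\u200C", "\u200D"

SEQUENCES = {
    "default conjunct": KA + HALANT + SSA,
    "explicit halant (ZWNJ)": KA + HALANT + ZWNJ + SSA,
    "half form kept (ZWJ)": KA + HALANT + ZWJ + SSA,
}


def code_points(text: str) -> str:
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for label, seq in SEQUENCES.items():
    # Print the code points and the Unicode character names of each sequence.
    print(f"{label}: {code_points(seq)}")
    print("    " + ", ".join(unicodedata.name(ch) for ch in seq))
```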
4 Overall Development Process and Methodology
Under the Neo-Brahmi Generation Panel, there are many different scripts belonging to
separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP ensured that the fundamental philosophy behind building those LGRs is in sync across all Brahmi-derived scripts. This is the Devanagari LGR, which caters to
multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code-points in the code-point
repertoire, across the board, for all the scripts within its ambit.
4.1.1 Inclusion principles
4.1.1.1 Modern usage
Every character proposed should be in the everyday usage of a particular linguistic
community. Characters which have been encoded in Unicode for transcription purposes
only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous use
Every character proposed should have an unambiguous understanding among the linguistic
community about its usage in the language.
4.1.2 Exclusion principles
The main exclusion principle is that of External Limits on Scope. These comprise protocols
or standards that are prerequisites to the Label Generation Rule Sets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the
sake of clarity.
4.1.2.1 External Limits on Scope
The code point repertoire for the root zone being a very special case up the ladder in the
protocol hierarchies, the canvas of available characters for selection as a part of the Root
Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Standard
Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the
Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible
representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in
Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.
For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ़ (U+093C), can be used instead.
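The relationship between U+0958 and its decomposed form can be observed with Python's standard unicodedata module: NFC normalization rewrites the precomposed letter into Ka + Nukta. The sketch below only demonstrates that behaviour; the helper name is_nfc_stable is hypothetical, and the snippet is not a substitute for an IDNA validity check.

```python
import unicodedata

QA = "\u0958"                # DEVANAGARI LETTER QA (precomposed)
KA_NUKTA = "\u0915\u093C"    # DEVANAGARI LETTER KA + DEVANAGARI SIGN NUKTA

# NFC does not keep the precomposed letter; it yields the two-code-point form.
nfc_of_qa = unicodedata.normalize("NFC", QA)
print([f"U+{ord(ch):04X}" for ch in nfc_of_qa])


def is_nfc_stable(text: str) -> bool:
    """True if NFC normalization leaves the text unchanged."""
    return unicodedata.normalize("NFC", text) == text
```

Under this check the sequence U+0915 U+093C is stable, while the single code point U+0958 is not.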
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C)
and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These are required in
certain cases where a typical visual shape of an akshar is desired.
In domain names, due to the absence of space or tab characters, there will be cases where the inability
to use ZWNJ can pose some issues: two words need to be joined together, where the
previous word needs to end in an Explicit Halant and the next word begins with a consonant. In that case, a conjunct will be formed between the last consonant of the first word
and the first consonant of the second word. This visual display may not be desired. For example, if the two words देश् (deš, nation) and विदेश (videš, foreign land) are juxtaposed to each other, the resultant word, i.e. “देश्विदेश”¹⁰, is not the appropriate way of rendering it. The appropriate rendering, in which the Halant stays visible and the two words remain visually distinct, can be achieved by adding a ZWNJ in between the two words.
As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and
when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the
Devanagari repertoire, if necessary.
However, there may not be much of an impact from the exclusion of the ZWJ from the MSR, as there
are better alternatives already available for depicting the cases for which ZWJ was earlier
used. Some specific shapes¹¹ may not be possible to produce; however, there will not be any
impact at the phonetic level.
iii. Maximal Starting Repertoire
The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set
of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].
To sum up, the restrictions start off with admitting only such characters as are part of the
code-block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
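The three successive filters can be pictured as a small pipeline. The sketch below is a toy model under stated assumptions: the candidate list and the two exclusion sets contain only the code points named in this section (U+0958, U+093D, U+200C) plus U+0915, they are not the real IDNA or MSR tables, and the function names are hypothetical.

```python
import unicodedata

# Toy candidates: KA, QA, AVAGRAHA, ZWNJ.
CANDIDATES = [0x0915, 0x0958, 0x093D, 0x200C]

# Illustrative exclusion sets built only from the examples in the text.
IDNA_EXCLUDED = {0x0958}          # precomposed Qa is not allowed in domain names
MSR_EXCLUDED = {0x093D, 0x200C}   # Avagraha and ZWNJ are not part of the MSR


def encoded_in_unicode(cp: int) -> bool:
    """Filter i: the code point must be an encoded (named) character."""
    return unicodedata.name(chr(cp), None) is not None


def surviving_code_points(candidates):
    """Apply the Unicode, IDNA and MSR filters in that order."""
    stage1 = [cp for cp in candidates if encoded_in_unicode(cp)]
    stage2 = [cp for cp in stage1 if cp not in IDNA_EXCLUDED]
    stage3 = [cp for cp in stage2 if cp not in MSR_EXCLUDED]
    return stage3
```

With these toy sets, only U+0915 survives all three stages.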
10 In this particular case, though it is possible to get the required display by dropping the explicit Halant at the end of the word, one can argue that the pronunciation of the two words, i.e. देश् and देश, is different, and hence it changes the fundamental word.
11 Case of the conjunct क्ष and its half-form rendering: the first is composed with क + ् + ष, while the latter is composed with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such
as Danda (U+0964) and double Danda (U+0965), will not be included.
4.1.2.3 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like Isshar
(U+09FA), Abbreviation sign (U+0970), etc. will not be included.
4.1.2.4 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms,
especially like DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms (U+0944) and (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.
4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through the Public Comment process
The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were of an editorial nature. Some comments called for attention to the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided, for each individual comment received, whether it needed a change in the overall LGR recommendation.
Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script on which
the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2: Devanagari Code Page from [MSR]
Color convention:¹²
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA 2008 but excluded from the MSR for various reasons - Pinkish background
Not PVALID in IDNA 2008 - White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references have been given in the last column, titled
Reference. For the entire coverage of Devanagari code points, references for Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points
required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No.  Unicode Code Point  Glyph  Character Name  Category  Example languages using the code point (not exhaustive list)  Language with lowest EGIDS scale using the code point  Reference
1 0901 DEVANAGARISIGNCANDRABINDU Candrabindu
BodoHindiKashmiriKonkaniMaithiliMarathiNepaliSantaliand
Sanskrit
1HindiNepali
[0][101][102][103][105][108][110][111][112]
[113]
2 0902 DEVANAGARISIGNANUSVARA
Anusvara(Bindu)
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
3 0903 ः DEVANAGARISIGNVISARGA Visarga
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
4 0905 अ DEVANAGARILETTERA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
5 0906 आ DEVANAGARILETTERAA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
6 0907 इ DEVANAGARILETTERI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
7 0908 ई DEVANAGARILETTERII Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
8 0909 उ DEVANAGARILETTERU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
9 090A ऊ DEVANAGARILETTERUU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
10 090B ऋ DEVANAGARI
LETTERVOCALICR
Vowel HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
11 090D ऍ DEVANAGARILETTERCANDRAE Vowel Hindi 1Hindi [0][101]
12 090E ऎ DEVANAGARILETTERSHORTE Vowel Kashmiri 4Kashmiri [0][105][108]
13 090F ए DEVANAGARILETTERE Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
14 0910 ऐ DEVANAGARILETTERAI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
15 0911 ऑ DEVANAGARI
LETTERCANDRAO
Vowel HindiKonkaniMarathiKashmiri 1Hindi
[0][100][101][102][108]
[112]
16 0912 ऒ DEVANAGARILETTERSHORTO Vowel Kashmiri 4Kashmiri [0][105][108]
17 0913 ओ DEVANAGARILETTERO Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
18 0914 औ DEVANAGARILETTERAU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
19 0915 क DEVANAGARILETTERKA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
20 0916 ख DEVANAGARILETTERKHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
21 0917 ग DEVANAGARILETTERGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
22 0918 घ DEVANAGARILETTERGHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
23 0919 ङ DEVANAGARILETTERNGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
24 091A च DEVANAGARILETTERCA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
25 091B छ DEVANAGARILETTERCHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
26 091C ज DEVANAGARILETTERJA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
27 091D झ DEVANAGARILETTERJHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
28 091E ञ DEVANAGARILETTERNYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
29 091F ट DEVANAGARILETTERTTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
30 0920 ठ DEVANAGARILETTERTTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
31 0921 ड DEVANAGARILETTERDDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
32 0922 ढ DEVANAGARILETTERDDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
33 0923 ण DEVANAGARILETTERNNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
34 0924 त DEVANAGARILETTERTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
35 0925 थ DEVANAGARILETTERTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
36 0926 द DEVANAGARILETTERDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
37 0927 ध DEVANAGARILETTERDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
38 0928 न DEVANAGARILETTERNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
39 092A प DEVANAGARILETTERPA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
40 092B फ DEVANAGARILETTERPHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
41 092C ब DEVANAGARILETTERBA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
42 092D भ DEVANAGARILETTERBHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
43 092E म DEVANAGARILETTERMA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
44 092F य DEVANAGARILETTERYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
45 0930 र DEVANAGARILETTERRA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
46 0932 ल DEVANAGARILETTERLA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
47 0933 ळ DEVANAGARILETTERLLA Consonant
BodoKonkaniMarathiNepali
Sanskrit1Nepali
[0][102][103][110][112]
[113]
48 0935 व DEVANAGARILETTERVA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
49 0936 श DEVANAGARILETTERSHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
50 0937 ष DEVANAGARILETTERSSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
51 0938 स DEVANAGARILETTERSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
52 0939 ह DEVANAGARILETTERHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
53 093A DEVANAGARI VOWEL SIGN OE Matra Kashmiri 4Kashmiri [11][105][108]
54 093B ऻ DEVANAGARI VOWEL SIGN OOE Matra Kashmiri 4Kashmiri [11][105][108]
55 093C DEVANAGARISIGNNUKTA Nukta
BodoHindiKashmiriMaithiliSantaliSindhi
1Hindi[0][101][105][108][110][109][111]
56 093E ा DEVANAGARIVOWELSIGNAA Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
57 093F ि DEVANAGARIVOWELSIGNI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
58 0940 ी DEVANAGARIVOWELSIGNII Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
59 0941 DEVANAGARIVOWELSIGNU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
60 0942 DEVANAGARIVOWELSIGNUU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
61 0943
DEVANAGARIVOWELSIGNVOCALICR
Matra HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
62 0945
DEVANAGARIVOWELSIGNCANDRAE=candra
MatraHindiKonkaniMarathiSanskrit
Kashmiri1Hindi [0][100][101]
[108]
63 0946
DEVANAGARIVOWELSIGNSHORTE
Matra Kashmiri 4Kashmiri [0][105][108]
64 0947 DEVANAGARIVOWELSIGNE Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
65 0948 DEVANAGARIVOWELSIGNAI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
66 0949 ॉ DEVANAGARIVOWELSIGNCANDRAO
Matra HindiKonkaniMarathiKashmiri 1Hindi [0][100][108]
67 094A ॊ DEVANAGARIVOWELSIGNSHORTO
Matra Kashmiri 4Kashmiri [0][105][108]
68 094B ो DEVANAGARIVOWELSIGNO Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
69 094C ौ DEVANAGARIVOWELSIGNAU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
70 094D DEVANAGARISIGN
VIRAMAHalantVirama
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
71 094F ॏ DEVANAGARIVOWELSIGNAW Matra Kashmiri 4Kashmiri [0][105][108]
72 0956 DEVANAGARIVOWELSIGNUE Matra Kashmiri 4Kashmiri [11][105][108]
73 0957 DEVANAGARIVOWELSIGNUUE Matra Kashmiri 4Kashmiri [11][105][108]
74 0972 ॲ
DEVANAGARILETTERCANDRA
AVowel KonkaniMarathi
Kashmiri2KonkaniMarathi
[9][100][102][108][112]
75 0973 ॳ DEVANAGARILETTEROE Vowel Kashmiri 4Kashmiri [11][105][108]
76 0974 ॴ DEVANAGARILETTEROOE Vowel Kashmiri 4Kashmiri [11][105][108]
77 0975 ॵ DEVANAGARILETTERAW Vowel Kashmiri 4Kashmiri [11][105][108]
78 0976 ॶ DEVANAGARILETTERUE Vowel Kashmiri 4Kashmiri [11][105][108]
79 0977 ॷ DEVANAGARILETTERUUE Vowel Kashmiri 4Kashmiri [11][105][108]
80 097B ॻ DEVANAGARILETTERGGA Consonant Sindhi 2Sindhi [8][104]
81 097C ॼ DEVANAGARILETTERJJA Consonant Sindhi 2Sindhi [8][104]
82 097E ॾ DEVANAGARILETTERDDDA Consonant Sindhi 2Sindhi [8][104]
83 097F ॿ DEVANAGARILETTERBBA Consonant Sindhi 2Sindhi [8][104]
Table 6 Code point repertoire
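For quick experiments, the 83 individual code points of Table 6 can be written as a Python set and used to flag code points that fall outside the repertoire. This is a minimal sketch: the ranges are transcribed from the Unicode Code Point column above, the names REPERTOIRE and outside_repertoire are hypothetical, and the check covers repertoire membership only, not sequences or Whole Label Evaluation rules.

```python
def _span(first: int, last: int) -> range:
    """Inclusive range of code points."""
    return range(first, last + 1)


# Code points of Table 6, grouped into contiguous runs.
REPERTOIRE = {
    cp
    for run in (
        _span(0x0901, 0x0903), _span(0x0905, 0x090B), _span(0x090D, 0x0928),
        _span(0x092A, 0x0930), _span(0x0932, 0x0933), _span(0x0935, 0x093C),
        _span(0x093E, 0x0943), _span(0x0945, 0x094D), _span(0x094F, 0x094F),
        _span(0x0956, 0x0957), _span(0x0972, 0x0977), _span(0x097B, 0x097C),
        _span(0x097E, 0x097F),
    )
    for cp in run
}


def outside_repertoire(label: str) -> list:
    """Return the code points of the label that are not in Table 6."""
    return [f"U+{ord(ch):04X}" for ch in label if ord(ch) not in REPERTOIRE]
```

The set has 83 members, matching the last serial number in the table. A label containing the Avagraha (U+093D) is reported as having one code point outside the repertoire.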
Apart from the above individual code-points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI
LETTER RRA in the repertoire, for enabling inclusion of the “Eyelash Reph”¹³ construct.
Sr. No.  Unicode Code Points  Sequence  Character Names  Example languages using the code-point (not exhaustive list)  Reference
1  0931 094D 092F  ऱ्य  DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA  Konkani, Marathi, Nepali  [106][107]
2  0931 094D 0939  ऱ्ह  DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA  Konkani, Marathi, Nepali  [106][107]
Table 7 Sequences
13 Unicode uses the term “Eyelash Ra” instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by Normal Ra, U+0930), the term “Reph” is used here.
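Since DEVANAGARI LETTER RRA (U+0931) is admitted only as part of the two sequences in Table 7, a label check can look for any RRA that is not followed by Virama + YA or Virama + HA. The sketch below illustrates that idea with a negative lookahead; the function name rra_only_in_sequences is hypothetical, and the accompanying XML file remains the authoritative expression of the rule.

```python
import re

# U+0931 that is NOT followed by VIRAMA (U+094D) + YA (U+092F) or HA (U+0939).
_STRAY_RRA = re.compile("\u0931(?!\u094D[\u092F\u0939])")


def rra_only_in_sequences(label: str) -> bool:
    """True if every U+0931 in the label starts one of the Table 7 sequences."""
    return _STRAY_RRA.search(label) is None
```

A label with U+0931 U+094D U+092F passes, while U+0931 on its own or followed by a Matra does not.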
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No.  Unicode Code Point  Glyph  Character Name  Reason for exclusion
1  U+0904  ऄ  DEVANAGARI LETTER SHORT A  Usage unknown. Not required explicitly by any language.
2  U+090C  ऌ  DEVANAGARI LETTER VOCALIC L  Not in modern usage. Excluded as per conservatism principle.
3  U+0929  ऩ  DEVANAGARI LETTER NNNA  Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4  U+0934  ऴ  DEVANAGARI LETTER LLLA  Not required in any spoken language. Required only for transcribing Dravidian l.
5  U+0944  ॄ  DEVANAGARI VOWEL SIGN VOCALIC RR  Not in modern usage. Excluded as per conservatism principle.
6  U+0979  ॹ  DEVANAGARI LETTER ZHA  Not required in any spoken language. Required only in transliteration of Avestan.
7  U+097A  ॺ  DEVANAGARI LETTER HEAVY YA  Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of formation of
their words, known as akshar. In the next section, there are detailed akshar formation rules as applicable to the representation of the Hindi language when written in the Devanagari
script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi
but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in
Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the
Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the
languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists
the categories of the characters in the form of variables. In the rules, instead of their descriptive names, the variable names are used. The second section lists four operators, along with their functions, which are assumed while specifying the rules. The following two
sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as Indian Script Code for Information
Interchange [ISCII].
5.5.1 Variables involved
Dash  → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol   Function
|        Alternative
[ ]      Optional
*        Variable Repetition
( )      Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari
when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B),
Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in
Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is
used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description    Sequence   Example   Constituting characters
Vowel                   V          अ a       U+0905
Vowel+Anusvara          V[B]       अं aṁ     अ ं  U+0905 U+0902
Vowel+Candrabindu       V[D]       अँ aṃ     अ ँ  U+0905 U+0901
Vowel+Visarga           V[X]       अः aḥ     अ ः  U+0905 U+0903
Table 9
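The vowel sequence V[B|D|X] translates directly into a regular expression. The sketch below is an approximation for illustration: the vowel class uses the contiguous range U+0905 to U+0914, which is slightly broader than the vowels actually in the repertoire, and the name is_vowel_sequence is hypothetical.

```python
import re

V = "[\u0905-\u0914]"   # independent vowels (simplified, contiguous range)
B = "\u0902"            # Anusvara
D = "\u0901"            # Candrabindu
X = "\u0903"            # Visarga

# V[B|D|X]: a vowel optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(f"{V}[{B}{D}{X}]?")


def is_vowel_sequence(text: str) -> bool:
    """True if the whole string is a single vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Each row of Table 9 matches this pattern, whereas a vowel followed by both a Candrabindu and a Visarga does not.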
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta
(N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the Consonant sequence after the N, M and H. Each of
these has been discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta
sign, wherever such a case is pertinent.)
Examples:
Sequence Description    Sequence   Example   Constituting characters
Consonant               C          क ka      U+0915 <single character>
Consonant+Nukta         C[N]       क़ ḳa     क ़  U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]
C[M|B|D|X|H]
Examples:
Sequence Description    Sequence   Example                  Constituting characters
Consonant+Matra         C[M]       कि ki                    क ि  U+0915 U+093F
Consonant+Anusvara      C[B]       कं kaṁ                   क ं  U+0915 U+0902
Consonant+Candrabindu   C[D]       कँ kaṃ                   क ँ  U+0915 U+0901
Consonant+Visarga       C[X]       कः kaḥ                   क ः  U+0915 U+0903
Consonant+Halant        C[H]       क् k (Pure Consonant)    क ्  U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X
(CM)[D|B|X]
Example:
Sequence Description            Sequence   Example    Constituting characters
Consonant+Matra+Anusvara        CM[B]      कीं kīṁ    क ी ं  U+0915 U+0940 U+0902
Consonant+Matra+Candrabindu     CM[D]      काँ kāṃ    क ा ँ  U+0915 U+093E U+0901
Consonant+Matra+Visarga         CM[X]      कीः kīḥ    क ी ः  U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant¹⁴: 3(CH)C
Example:
Sequence Description                                                       Sequence   Example        Constituting characters
Consonant+Halant+Consonant+Halant+Consonant+Halant+Consonant              CHCHCHC    न्क्र्य nkrya   U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the
number of consonants that can be joined by a Halant.
Subsets
3A. The combination may be followed by M, B, D or X
Example:
Sequence Description                          Sequence   Example       Constituting characters
Consonant+Halant+Consonant+Matra              CHC[M]     क्की kkī      U+0915 U+094D U+0915 U+0940
Consonant+Halant+Consonant+Anusvara           CHC[B]     क्कं kkaṁ     U+0915 U+094D U+0915 U+0902
Consonant+Halant+Consonant+Candrabindu        CHC[D]     क्कँ kkaṃ     U+0915 U+094D U+0915 U+0901
Consonant+Halant+Consonant+Visarga            CHC[X]     क्कः kkaḥ     U+0915 U+094D U+0915 U+0903
Table 14
14 In the case of Sanskrit, it can join up to 5 consonants.
3B. 3(CH)CM may be followed by a B, D or X
Example:
Sequence Description                                Sequence    Example        Constituting characters
Consonant+Halant+Consonant+Matra+Anusvara           CHCM[B]     क्कीं kkīṁ     U+0915 U+094D U+0915 U+0940 U+0902
Consonant+Halant+Consonant+Matra+Candrabindu        CHCM[D]     क्कीँ kkīṃ     U+0915 U+094D U+0915 U+0940 U+0901
Consonant+Halant+Consonant+Matra+Visarga            CHCM[X]     क्कीः kkīḥ     U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters
and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha.
Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
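The consonant sequence rules of Section 5.5.4 (rules 1, 2, 2A, 3, 3A and 3B) can be folded into one regular expression. This is a sketch under stated assumptions, not the WLE rules of Section 7: the character classes are simplified contiguous ranges that are slightly broader than the repertoire, a Nukta is allowed after any consonant, the Hindi limit of four consonants per conjunct is applied, and the name is_consonant_sequence is hypothetical.

```python
import re

C = "[\u0915-\u0939]"   # consonants (simplified, contiguous range)
M = "[\u093E-\u094C]"   # matras (simplified, contiguous range)
B, D, X = "\u0902", "\u0901", "\u0903"   # Anusvara, Candrabindu, Visarga
H, N = "\u094D", "\u093C"                # Halant, Nukta

CN = f"{C}{N}?"   # a consonant together with its optional Nukta

# Up to three (C H) groups, a final consonant, then either a Halant
# or an optional Matra followed by at most one of B, D, X.
CONSONANT_SEQUENCE = re.compile(
    f"(?:{CN}{H}){{0,3}}{CN}(?:{H}|{M}?[{B}{D}{X}]?)"
)


def is_consonant_sequence(text: str) -> bool:
    """True if the whole string is a single consonant sequence."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

The examples of Tables 10 to 15 match this pattern. A Matra followed by a Halant, or a conjunct of five consonants, is rejected by this Hindi-specific sketch, although Section 7 does not limit the number of consonants joined by a Halant.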
6 Variants
There are no characters or character sequences in Devanagari that can be created using the characters permitted by the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and are hence proposed as variants. A brief description of these variants follows, with the variants themselves listed in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for the positioning of the Nukta character (U+093C), which is not common in other Devanagari-based languages. Santali requires the Nukta to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated that the Whole Label Evaluation rules (given in Section 7) be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed.
Variant 1 | Variant 2
आ (U+0906) | आ़ (U+0906 U+093C)
ओ (U+0913) | ओ़ (U+0913 U+093C)
ा (U+093E) | ा़ (U+093E U+093C)
ो (U+094B) | ो़ (U+094B U+093C)
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C), in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all of the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
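The effect of the "not followed by Nukta" condition can be sketched as follows. This is an illustrative sketch only, not the LGR tool's algorithm; the function name is hypothetical. Each Table 16 base character is treated as one unit together with a Nukta that already follows it, so a substitution can add or drop a single Nukta but can never stack a second one.

```python
from itertools import product

NUKTA = "\u093C"
# Variant 1 characters of Table 16
SANTALI_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}

def nukta_variant_labels(label):
    """Return all variant labels obtained via the Table 16 mappings."""
    units = []
    i = 0
    while i < len(label):
        ch = label[i]
        if ch in SANTALI_BASES:
            # A base already followed by a Nukta is consumed as Variant 2;
            # the Variant 1 -> Variant 2 mapping is not applied to it again.
            has_nukta = label[i + 1:i + 2] == NUKTA
            units.append((ch, ch + NUKTA))
            i += 2 if has_nukta else 1
        else:
            units.append((ch,))
            i += 1
    variants = {"".join(choice) for choice in product(*units)}
    variants.discard(label)
    return variants
```

Applied to का (U+0915 U+093E), the sketch yields only U+0915 U+093E U+093C, and applying it to that result leads back to the original label rather than to a double Nukta.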
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence or absence of a Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection of these contexts.
Likewise for Set D: 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
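As a rough illustration of the when(followed-by-V-C-or-end) context, the sketch below applies one mapping of Variant Set C (0906 0902 to 0974) only where the sequence is followed by a Vowel, a Consonant or the end of the label. The vowel and consonant ranges are simplifying assumptions, and the function names are hypothetical; an actual LGR tool evaluates such rules from the XML definition.

```python
# Assumed, simplified category ranges (the normative sets are in the repertoire).
VOWELS = set(range(0x0905, 0x0915)) | {0x0972, 0x0973, 0x0974, 0x0975}
CONSONANTS = set(range(0x0915, 0x093A))

def followed_by_v_c_or_end(label, end):
    """True if position `end` is the end of the label or holds a V or C."""
    if end == len(label):
        return True
    cp = ord(label[end])
    return cp in VOWELS or cp in CONSONANTS

def apply_set_c(label):
    """Map U+0906 U+0902 to U+0974 wherever the context rule is satisfied."""
    src, tgt = "\u0906\u0902", "\u0974"
    out, i = [], 0
    while i < len(label):
        if label.startswith(src, i) and followed_by_v_c_or_end(label, i + len(src)):
            out.append(tgt)
            i += len(src)
        else:
            out.append(label[i])
            i += 1
    return "".join(out)
```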
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 technically overlaps with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set that overlaps with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences that include 0902 as the second element have V or M as the first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ (U+0973) | अं (U+0905 U+0902)
ऺ (U+093A) | ं (U+0902)
ॴ (U+0974) | आं (U+0906 U+0902)
ऻ (U+093B) | ां (U+093E U+0902)
ऎ (U+090E) | ऐ (U+0910)
ॆ (U+0946) | े (U+0947)
ॵ (U+0975) | औ (U+0914)
ॏ (U+094F) | ौ (U+094C)
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mapping targets will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)
Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence or absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which when succeeded by an Anusvara (U+0902) create a shape that resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual variant pairs are as follows:
Variant 1 | Variant 2
ँ (U+0901) | ॅं (U+0945 U+0902)
ाँ (U+093E U+0901) | ॉं (U+0949 U+0902)
अँ (U+0905 U+0901) | ॲं (U+0972 U+0902)
एँ (U+090F U+0901) | ऍं (U+090D U+0902)
आँ (U+0906 U+0901) | ऑं (U+0911 U+0902)
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, like a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, U+0945 U+0902 and U+0949 U+0902 should each be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, Vowel, Nukta or Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a Consonant or a Consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the character preceding ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
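The context condition on the first pair of Table 18 can be sketched in a few lines. This is an illustration under stated assumptions: the consonant range is a simplification of the repertoire, the function names are hypothetical, and only the (U+0901) -> (U+0945 U+0902) direction of the first pair is shown.

```python
CANDRABINDU = 0x0901
NUKTA = 0x093C
CONSONANTS = range(0x0915, 0x093A)   # assumed, simplified consonant range

def follows_only_c_or_n(cps, i):
    """True if position i is preceded by C, or by C followed by a Nukta."""
    if i >= 1 and cps[i - 1] in CONSONANTS:
        return True
    return i >= 2 and cps[i - 1] == NUKTA and cps[i - 2] in CONSONANTS

def candrabindu_variant(label):
    """Replace each U+0901 with U+0945 U+0902 where the context rule holds."""
    cps = [ord(ch) for ch in label]
    out = []
    for i, cp in enumerate(cps):
        if cp == CANDRABINDU and follows_only_c_or_n(cps, i):
            out.extend([0x0945, 0x0902])
        else:
            out.append(cp)
    return "".join(chr(cp) for cp in out)
```

A Candrabindu after a Vowel, as in अँ (U+0905 U+0901), is left untouched by this particular mapping, because a Matra could not follow the Vowel.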
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the "blocked" disposition.
There is no preference among these variants. Whichever label containing one of these variants is chosen first, the other equivalent variant labels should be blocked.
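The "first come, others blocked" behavior can be pictured with a toy registry. The sketch below is an assumption-laden illustration, not how any registry or LGR tool is implemented: it covers only the Table 17 mappings, ignores all context rules, and the `index_label` function and `ToyRegistry` class are hypothetical names. Labels that are variants of each other share one index label, and only the first applicant for a given index label is allocated.

```python
# Table 17: each Variant 1 code point mapped to its Variant 2 counterpart.
TABLE_17 = {
    0x0973: "\u0905\u0902",
    0x093A: "\u0902",
    0x0974: "\u0906\u0902",
    0x093B: "\u093E\u0902",
    0x090E: "\u0910",
    0x0946: "\u0947",
    0x0975: "\u0914",
    0x094F: "\u094C",
}

def index_label(label):
    """Fold every Variant 1 character into its Variant 2 form."""
    return label.translate(TABLE_17)

class ToyRegistry:
    def __init__(self):
        self._allocated = {}          # index label -> allocated label

    def apply(self, label):
        key = index_label(label)
        if key in self._allocated:
            return "blocked"          # a variant (or the label itself) is taken
        self._allocated[key] = label
        return "allocated"
```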
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of the cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + …" cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are considered equivalents of each other, semantically or otherwise. They are grouped only on the basis of possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं (U+0902) | ਂ (U+0A02)
इ (U+0907) | ਙ (U+0A19)
उ (U+0909) | ਤ (U+0A24)
ग (U+0917) | ਗ (U+0A17)
घ (U+0918) | ਬ (U+0A2C)
ट (U+091F) | ਟ (U+0A1F)
ठ (U+0920) | ਠ (U+0A20)
ढ (U+0922) | ਫ (U+0A2B)
प (U+092A) | ਧ (U+0A27)
भ (U+092D) | ਮ (U+0A2E)
म (U+092E) | ਸ (U+0A38)
व (U+0935) | ਕ (U+0A15)
ह (U+0939) | ਵ (U+0A35)
ऺ (U+093A) | ਂ (U+0A02)
़ (U+093C) | ਼ (U+0A3C)
ि (U+093F) | ਿ (U+0A3F)
ी (U+0940) | ੀ (U+0A40)
ॅ (U+0945) | ੱ (U+0A71)
ॆ (U+0946) | ੇ (U+0A47)
ॆ (U+0946) | ੋ (U+0A4B)
े (U+0947) | ੇ (U+0A47)
े (U+0947) | ੋ (U+0A4B)
ै (U+0948) | ੈ (U+0A48)
ॖ (U+0956) | ੁ (U+0A41)
ॗ (U+0957) | ੂ (U+0A42)
प्टि (U+092A U+094D U+091F U+093F) | ਇ (U+0A07)
प्टी (U+092A U+094D U+091F U+0940) | ਈ (U+0A08)
प्टे (U+092A U+094D U+091F U+0947) | ਏ (U+0A0F)
प्टॆ (U+092A U+094D U+091F U+0946) | ਏ (U+0A0F)
त्त (U+0924 U+094D U+0924) | ਜ (U+0A1C)
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म (U+092E) | ম (U+09AE)
ि (U+093F) | ি (U+09BF)
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in the Santali language)
b. ओ (U+0913) (Required in the Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in the Santali language)
b. ो (U+094B) (Required in the Santali language)
2. H must be preceded by C or CN (see footnote 15)
3. M must be preceded by C or CN (see footnote 16)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
Footnotes 15 and 16: CN is a C followed by an N.
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
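WLE rules 1 to 7 above lend themselves to a simple left-to-right check. The following is a minimal sketch, assuming simplified category ranges for C, V and M (the normative categories are those of Table 6 and the accompanying XML file) and ignoring the Eyelash Reph sequences; the function name and the (index, rule number) result format are hypothetical.

```python
# Assumed, simplified category sets.
C = set(range(0x0915, 0x093A))
V = set(range(0x0905, 0x0915)) | {0x0972, 0x0973, 0x0974, 0x0975}
M = {0x093A, 0x093B, 0x094F, 0x0956, 0x0957} | set(range(0x093E, 0x094D))
B, D, X, H, N = 0x0902, 0x0901, 0x0903, 0x094D, 0x093C

# Sets named in rule 1.
C1 = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C, 0x0921, 0x0922, 0x092B}
V1 = {0x0906, 0x0913}
M1 = {0x093E, 0x094B}

def wle_violations(label):
    """Return a list of (index, rule_number) for each violated WLE rule."""
    cps = [ord(ch) for ch in label]
    problems = []
    for i, cp in enumerate(cps):
        prev = cps[i - 1] if i > 0 else None
        prev2 = cps[i - 2] if i > 1 else None
        after_c_or_cn = prev in C or (prev == N and prev2 in C)
        after_vcnm = prev in V or prev in C or prev == N or prev in M
        if cp == N:
            if prev not in C1 | V1 | M1:
                problems.append((i, 1))
        elif cp == H:
            if not after_c_or_cn:
                problems.append((i, 2))
        elif cp in M:
            if not after_c_or_cn:
                problems.append((i, 3))
        elif cp in (X, B, D):
            if not after_vcnm:
                problems.append((i, {X: 4, B: 5, D: 6}[cp]))
        elif cp in V:
            if prev == H:
                problems.append((i, 7))
    return problems
```

With these assumptions, the multi-word example discussed under "Case of V preceded by H" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930) is flagged under rule 7 at the position of U+0905.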
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of permissible sequences in Table 7 (Sequences), it gets permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
9 References
[MSR]IntegrationPanelMaximalStartingRepertoiremdashMSR-4OverviewandRationale7February2019httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf(Accessedon18thFeb2019)
[EGIDS]ExpandedGradedIntergenerationalDisruptionScalehttpswwwethnologuecomaboutlanguage-status(Accessedon13thNov2017)
[NBGP]Neo-BrahmiGenerationPanel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec. 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", RFC 8228, August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec. 2017)
[gTLD]genericTopLevelDomain
[ISCII]IndianScriptCodeforInformationInterchangehttpscdacinindexaspxid=mlc_gist_iscii(Accessedon2ndFeb2018)
[GIST]GraphicsIntelligencebasedScriptTechnologieshttpscdacinindexaspxid=gist(Accessedon2ndFeb2018)
[C-DAC]CentreforDevelopmentofAdvancedComputinghttpscdacin(Accessedon2ndFeb2018)
[0]TheUnicodeStandard11httpwwwunicodeorgversionsUnicode110(Accessedon12thDec2017)
[8]TheUnicodeStandard50httpwwwunicodeorgversionsUnicode500(Accessedon12thDec2017)
[9]TheUnicodeStandard51httpwwwunicodeorgversionsUnicode510(Accessedon12thDec2017)
[11]TheUnicodeStandard60httpwwwunicodeorgversionsUnicode600(Accessedon12thDec2017)
[100]DevanāgarīVIPTeamldquoVariantIssuesReportrdquoICANN3rdOct2011httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf(Accessedon10thOct2017)
[101]OmniglotHindihttpswwwomniglotcomwritinghindihtm(Accessedon10thOct2017)
[102]OmniglotMarathihttpswwwomniglotcomwritingmarathihtm(Accessedon10thOct2017)
[103]OmniglotSanskrithttpswwwomniglotcomwritingsanskrithtm(Accessedon10thOct2017)
[104]OmniglotSindhihttpswwwomniglotcomwritingsindhihtm(Accessedon10thOct2017)
[105]OmniglotKashmirihttpswwwomniglotcomwritingkashmirihtm(Accessedon10thOct2017)
[106]Unicode1000SouthandCentralAsia-I-OfficialScriptsofIndiardquoPage456(R5andR5a)httpwwwunicodeorgversionsUnicode1000ch12pdf(Accessedon13thNov2017)
[107]UnicodeIndicGroupDevanagariEyelashRahttpunicodeorg~emulleriwgp8utcdochtml(Accessedon13thNov2017)
[108]MKRainaHowtoreadandwriteKashmiriinDevanagarihttpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf(Accessedon12thDec2017)
[109]CentralHindiDirectorate-MinistryofHRD-GovtofIndiaDevanāgarīAlphabetanditsRomanizationhttphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml(Accessedon12thDec2017
[110]OmniglotBodohttpswwwomniglotcomwritingbodohtm(Accessedon12thDec2017)
[111]OmniglotMaithilihttpswwwomniglotcomwritingmaithilihtm(Accessedon12thDec2017)
[112]OmniglotKonkanihttpswwwomniglotcomwritingkonkanihtm(Accessedon20thMay2018)
[113]OmniglotNepalihttpswwwomniglotcomwritingnepalihtm(Accessedon20thMay2018)
[114] NBGP Public comment feedback for Devanagari Gujarati Gurmukhi Script LGRProposalshttpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw(Accessedon18thFeb2019)
10 Booksarticlesandwebographiesconsulted
Followingisathematicallysortedsetofdocumentsbooksarticlesandwebographies
consultedinthedraftingofthisreport
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
11 KachruYamuna1966AnIntroductiontoHindiSyntaxUrbanaUniversityof
IllinoisDepartmentofLinguistics
12 KalyanKaleandAnjaliSoman1986LearningMarathiShriVishakhaPrakashan
Pune
13 McGregorRS(1977)OutlineofHindiGrammar2ndedDelhiOxfordUniversity
Press
14 McGregorRS1972OutlineofHindiGrammarwithExercisesDelhiOxford
UniversityPress
15 McGregorRS1974HindiLiteratureoftheNineteenthandEarlyTwentieth
CenturiesWiesbadenHarrassowitz
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters and character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क (U+0915) | क़ (U+0915 U+093C)
ख (U+0916) | ख़ (U+0916 U+093C)
ग (U+0917) | ग़ (U+0917 U+093C)
च (U+091A) | च़ (U+091A U+093C)
छ (U+091B) | छ़ (U+091B U+093C)
ज (U+091C) | ज़ (U+091C U+093C)
ड (U+0921) | ड़ (U+0921 U+093C)
ढ (U+0922) | ढ़ (U+0922 U+093C)
फ (U+092B) | फ़ (U+092B U+093C)
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are grouped only on the basis of possible visual confusability.
At first they may not look exactly the same; however, in a given context, e.g. in a browser bar, as part of a domain name, or as a single word with no surrounding text from the same script to help distinguish them, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
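The "all constituents must have a counterpart" condition can be sketched as a per-character lookup. The mapping below is an assumption for illustration: it copies only the single-character Gurmukhi rows of Table 22, works on individual characters rather than whole aksharas, and the function name is hypothetical.

```python
# Devanagari character -> set of Gurmukhi look-alikes (Gurmukhi rows of Table 22).
DEVA_TO_GURMUKHI = {
    "\u0920": {"\u0A28", "\u0A30"},   # U+0920 -> U+0A28, U+0A30
    "\u0921": {"\u0A21", "\u0A24"},   # U+0921 -> U+0A21, U+0A24
    "\u0922": {"\u0A22"},             # U+0922 -> U+0A22
    "\u0924": {"\u0A1C"},             # U+0924 -> U+0A1C
    "\u092F": {"\u0A27"},             # U+092F -> U+0A27
}

def fully_confusable(label, table=DEVA_TO_GURMUKHI):
    """True only if every character of a non-empty label has a counterpart."""
    return len(label) > 0 and all(ch in table for ch in label)
```

A single character without a counterpart, such as क (U+0915) here, is enough to make the whole label distinguishable.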
Devanagari confusable | Other script confusable | From script
ः (U+0903) | ઃ (U+0A83) | Gujarati
ः (U+0903) | ః (U+0C03) | Telugu
ः (U+0903) | ಃ (U+0C83) | Kannada
ः (U+0903) | ഃ (U+0D03) | Malayalam
ः (U+0903) | ඃ (U+0D83) | Sinhala
ः (U+0903) | ঃ (U+0983) | Bengali (see footnote 17)
उ (U+0909) | ও (U+0993) | Bengali
घ (U+0918) | ঘ (U+0998) | Bengali
ठ (U+0920) | ਨ (U+0A28) | Gurmukhi
ठ (U+0920) | ਰ (U+0A30) | Gurmukhi
ड (U+0921) | ਡ (U+0A21) | Gurmukhi
ड (U+0921) | ਤ (U+0A24) | Gurmukhi
ढ (U+0922) | ਢ (U+0A22) | Gurmukhi
त (U+0924) | ਜ (U+0A1C) | Gurmukhi
य (U+092F) | ਧ (U+0A27) | Gurmukhi
ॅ (U+0945) | ঁ (U+0981) | Bengali
Table 22 Devanagari Cross-script confusables
Footnote 17: The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
क (U+0915), ख (U+0916), ग (U+0917), ज (U+091C) and फ (U+092B) to show that words having these consonants with a nukta are to be pronounced in the Perso-Arabic style, e.g.
फ़िरोज़ firoz (U+092B U+093C U+093F U+0930 U+094B U+091C U+093C)
It is also placed under ड (U+0921) and ढ (U+0922) to indicate flapped sounds, e.g.
बढ़ bədh (U+092C U+0922 U+093C)
The Web Publication "DEVANĀGARĪ ALPHABET AND ITS ROMANIZATION" [109] by the Central Hindi Directorate, Ministry of HRD, Government of India clearly states such a use of Nukta in Hindi.
In Bodo, the Nukta is adjoined to ड (U+0921) [110]. In Maithili, it is adjoined to क (U+0915), ज (U+091C), ड (U+0921) and ढ (U+0922) [111]. In Sindhi, it is adjoined to ख (U+0916), ग (U+0917), ज (U+091C), फ (U+092B), ड (U+0921) and ढ (U+0922) [104].
In Kashmiri, it can also be adjoined to च (U+091A), छ (U+091B) and ज (U+091C) [108] to indicate the laterally released affricates:
च़ाय cay "tea" (U+091A U+093C U+093E U+092F)
छ़ल chal "wash (imperative)" (U+091B U+093C U+0932)
पॊज़ póz "fact" (U+092A U+094A U+091C U+093C)
Normally a Nukta is appended to a Consonant. However, the Santali language uses Nukta in a unique way. The Nukta is adjoined to the following vowels and vowel signs:
a. आ (U+0906)
b. ओ (U+0913)
c. ा (U+093E)
d. ो (U+094B)
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)
The Visarga is frequently used in Sanskrit and represents a sound very close to /h/, for example दुःख duhkh "sorrow, unhappiness" (U+0926 U+0941 U+0903 U+0916).
The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages using Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire, as it is barred in the Maximal Starting Repertoire.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after Halant) where default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष (ksha), which gets formed by क (ka) + ् (halant) + ष (sha), gets rendered as क्‌ष when formed by क (ka) + ् (halant) + Zero Width Non-joiner + ष (sha). In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.
The Zero Width Joiner (ZWJ) is another invisible character, which is used in certain cases (mostly after Halant) in which a particular conjunct combination gets rendered such that the constituting consonant shapes may not be directly visible in the conjunct shape. For example, the conjunct क्ष (ksha), which gets formed by क (ka) + ् (halant) + ष (sha), does not show the half form of ka joining with sha. However, using ZWJ, the constituting consonants' shapes are preserved in the visual depiction, i.e. क्‍ष, formed by क (ka) + ् (halant) + Zero Width Joiner + ष (sha).
Earlier, the ZWJ was recommended by the Unicode Consortium to be used to generate certain special conjuncts like Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of ZWJ is now not encouraged.
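The three encodings of the ksha cluster described above differ only in the invisible joiner. The following minimal sketch builds them and prints their code points; the helper names `code_points` and `strip_joiners` are illustrative and not part of the LGR.

```python
KA, SSA, HALANT = "\u0915", "\u0937", "\u094D"
ZWNJ, ZWJ = "\u200C", "\u200D"


def code_points(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def strip_joiners(text: str) -> str:
    """Remove ZWNJ and ZWJ, leaving only the visible code points."""
    return text.replace(ZWNJ, "").replace(ZWJ, "")


conjunct = KA + HALANT + SSA                 # default conjunct form
explicit_halant = KA + HALANT + ZWNJ + SSA   # Halant shown explicitly
half_form = KA + HALANT + ZWJ + SSA          # half form of ka preserved

for name, seq in [("conjunct", conjunct),
                  ("explicit halant", explicit_halant),
                  ("half form", half_form)]:
    print(f"{name}: {code_points(seq)}")
```

Once the joiners are stripped, all three strings collapse to the same sequence, which is why the distinction is purely one of rendering.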
4 Overall Development Process and Methodology
Under the Neo-Brahmi Generation Panel, there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP ensured that the fundamental philosophy behind building those LGRs is in sync with all other Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code-points in the code-point repertoire across the board for all the scripts within its ambit.
4.1.1 Inclusion principles
4.1.1.1 Modern usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code-point repertoire.
4.1.1.2 Unambiguous use
Every character proposed should have an unambiguous understanding among the linguistic community about its usage in the language.
4.1.2 Exclusion principles
The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope
The code point repertoire for the root zone being a very special case up the ladder in the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Standard
Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded, as far as possible, all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters out of the Unicode repertoire from being part of domain names.
For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ़ (U+093C), can be used instead.
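The relationship between U+0958 and its decomposed form can be observed with Unicode normalization. The sketch below uses Python's standard `unicodedata` module; it assumes, as background not stated in this document, that IDNA 2008 labels are required to be in Normalization Form C. The helper name `is_nfc_stable` is illustrative.

```python
import unicodedata

QA = "\u0958"  # DEVANAGARI LETTER QA (precomposed)

# U+0958 is not preserved by NFC normalization; the result is the
# two-code-point sequence KA + NUKTA.
decomposed = unicodedata.normalize("NFC", QA)
print([f"U+{ord(ch):04X}" for ch in decomposed])


def is_nfc_stable(text: str) -> bool:
    """True if the text is unchanged by NFC normalization."""
    return unicodedata.normalize("NFC", text) == text
```

Under that assumption, a label containing the precomposed letter could never be a normalized label, whereas the sequence U+0915 U+093C is already stable.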
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These are required in certain cases where a typical visual shape of an akshar is desired.
In domain names, due to the absence of space " " or tab "-", there will be cases where the inability to use ZWNJ can pose some issues: where two words need to be joined together, where the previous word needs to end in an Explicit Halant and the next word begins with a consonant. In that case, a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if two words, देश् (deš, nation) and विदेश (videš, foreign land), are juxtaposed to each other, the resultant word, i.e. "देश्विदेश" (see footnote 10), is not the appropriate way of rendering it. The appropriate rendering of the same would be "देश्‌विदेश", which can be achieved by adding a ZWNJ in between the two words.
As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire if necessary.
However, there may not be much of an impact of the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes (see footnote 11) may not be able to be made; however, there will not be any impact on the phonetic level.
iii. Maximal Starting Repertoire
The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].
To sum up, the restrictions start off with admitting only such characters as are part of the code-block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
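The successive narrowing can be pictured as three filters applied in order. The sketch below is purely illustrative: the exclusion sets contain only the examples named in this section (they are not the real IDNA or MSR tables), block membership stands in for "encoded in Unicode", and the function name `first_failing_filter` is hypothetical.

```python
# Stand-in for "encoded in the script's Unicode block".
DEVANAGARI_BLOCK = range(0x0900, 0x0980)

# Sample exclusions taken from the examples in this section only.
IDNA_EXCLUDED_SAMPLE = {0x0958}   # DEVANAGARI LETTER QA
MSR_EXCLUDED_SAMPLE = {0x093D}    # DEVANAGARI SIGN AVAGRAHA


def first_failing_filter(cp: int):
    """Return the name of the first filter that rejects cp, or None if it passes all three."""
    if cp not in DEVANAGARI_BLOCK:
        return "Unicode block"
    if cp in IDNA_EXCLUDED_SAMPLE:
        return "IDNA"
    if cp in MSR_EXCLUDED_SAMPLE:
        return "MSR"
    return None
```

A code point that survives all three filters is only a candidate; the inclusion and exclusion principles of Section 4.1 still decide whether it enters the repertoire.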
10 In this particular case, though it is possible to get the required display by dropping the explicit Halant at the end of the word, one can argue that the pronunciation of the two words, i.e. देश् and देश, is different and hence it changes the fundamental word.
11 Case of क्ष and क्‍ष: the first is composed with क + ् + ष, while the latter is with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.
4.1.2.3 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970), etc. will not be included.
4.1.2.4 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, especially like DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ॄ (U+0944) and ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.
4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through the Public Comment process
The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were of an editorial nature. Some comments demanded attention to the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided, for each of the individual comments received, whether it needed a change in the overall LGR recommendation.
Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change is not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2: Devanagari Code Page from [MSR]
Color convention (see footnote 12):
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA 2008 but excluded from the MSR for various reasons - Pinkish background
Not PVALID in IDNA 2008 - White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references have been given in the last column, titled "Reference". For the entire coverage of Devanagari code points, references of Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 | 0901 | ँ | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1 Hindi, Nepali | [0][101][102][103][105][108][110][111][112][113]
2 | 0902 | ं | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
3 | 0903 | ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 Hindi | [0][101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][101][102][108][112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 Nepali | [0][102][103][110][112][113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
53 | 093A | ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
55 | 093C | ़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 Hindi | [0][101][105][108][110][109][111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
59 | 0941 | ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
60 | 0942 | ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
61 | 0943 | ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E (= candra) | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0][100][101][108]
63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant/Virama | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9][100][102][108][112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8][104]
Table 6 Code point repertoire
Apart from the above individual code-points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph" (see footnote 13) construct.

Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code-point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7: Sequences

13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by Normal Ra, U+0930), the term "Reph" is used here.
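Conditional inclusion means that U+0931 is acceptable only as the first element of one of the two sequences in Table 7. A minimal sketch of such a check follows; the function name `rra_usage_ok` is hypothetical, and the check covers only this one constraint, not the rest of the LGR.

```python
RRA = "\u0931"

# The two sequences of Table 7: RRA + VIRAMA + YA and RRA + VIRAMA + HA.
ALLOWED_RRA_SEQUENCES = ("\u0931\u094D\u092F", "\u0931\u094D\u0939")


def rra_usage_ok(label: str) -> bool:
    """True if every U+0931 in the label starts one of the allowed sequences."""
    i = 0
    while i < len(label):
        if label[i] == RRA:
            if not label.startswith(ALLOWED_RRA_SEQUENCES, i):
                return False
            i += 3  # skip past the whole three-code-point sequence
        else:
            i += 1
    return True
```

A bare U+0931, or U+0931 followed by anything other than Virama plus YA or HA, is therefore rejected, while labels without U+0931 are unaffected.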
5.3 Code points not included
The following code points have not been included in the repertoire:

Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. In the next section, there are detailed akshar formation rules as applicable to representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, instead of their descriptive names, the variable names are used. The second section lists four operators along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol    Function
|         Alternative
[ ]       Optional
*         Variable repetition
( )       Sequence group
Table 8: Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ a | अ (U+0905)
Vowel + Anusvara | V[B] | अं aṁ | अ + ं (U+0905 U+0902)
Vowel + Candrabindu | V[D] | अँ aṃ | अ + ँ (U+0905 U+0901)
Vowel + Visarga | V[X] | अः aḥ | अ + ः (U+0905 U+0903)
Table 9
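The rule V[B|D|X] translates directly into a regular expression. The sketch below is an illustration only: the vowel class is an assumption that takes the independent vowels U+0905 to U+0914 minus U+090C (which is not in the repertoire), and the function name `is_vowel_sequence` is hypothetical.

```python
import re

# V      : independent vowels U+0905..U+0914, skipping U+090C (assumed class)
# [B|D|X]: at most one of Anusvara U+0902, Candrabindu U+0901, Visarga U+0903
VOWEL_SEQUENCE = re.compile(r"[\u0905-\u090B\u090D-\u0914][\u0901\u0902\u0903]?")


def is_vowel_sequence(text: str) -> bool:
    """True if the whole text is a single vowel sequence V[B|D|X]."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Each example row of Table 9 satisfies this check, while a vowel followed by both an Anusvara and a Visarga does not, matching the restriction to a single trailing sign.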
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the Consonant sequence after the N, M and H. Each of these has been discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क ka | U+0915 <single character>
Consonant + Nukta | C[N] | क़ ḳa | क + ़ (U+0915 U+093C)
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि ki | क + ि (U+0915 U+093F)
Consonant + Anusvara | C[B] | कं kaṁ | क + ं (U+0915 U+0902)
Consonant + Candrabindu | C[D] | कँ kaṃ | क + ँ (U+0915 U+0901)
Consonant + Visarga | C[X] | कः kaḥ | क + ः (U+0915 U+0903)
Consonant + Halant | C[H] | क् k (Pure Consonant) | क + ् (U+0915 U+094D)
Table 11
2A. A CM sequence can be optionally followed by D, B or X
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं kīṁ | क + ी + ं (U+0915 U+0940 U+0902)
Consonant + Matra + Candrabindu | CM[D] | काँ kāṃ | क + ा + ँ (U+0915 U+093E U+0901)
Consonant + Matra + Visarga | CM[X] | कीः kīḥ | क + ी + ः (U+0915 U+0940 U+0903)
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14)
*3(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य nkrya | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
14 In the case of Sanskrit, it can join up to 5 consonants.
Subsets
3A. The combination may be followed by M, B, D or X
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की kkī | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं kkaṁ | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ kkaṃ | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः kkaḥ | U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं kkīṁ | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ kkīṃ | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः kkīḥ | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
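One way to experiment with the akshar rules of Section 5.5 is to map each code point to its variable (C, M, V, B, D, X, H, N) and match the resulting string against a single pattern. The sketch below is one reading of those rules and not the normative WLE rules of Section 7: the category ranges are coarse (they ignore the few excluded code points inside each range), the combined pattern is an assumption that folds rules 1 to 3B together, and all names are hypothetical.

```python
import re

# Coarse code point ranges per variable; excluded code points inside a range
# (e.g. U+090C, U+0931) are not filtered out in this sketch.
CATEGORY_RANGES = [
    ("D", 0x0901, 0x0901),  # Candrabindu
    ("B", 0x0902, 0x0902),  # Anusvara (Bindu)
    ("X", 0x0903, 0x0903),  # Visarga
    ("V", 0x0905, 0x0914),  # Vowels
    ("C", 0x0915, 0x0939),  # Consonants
    ("N", 0x093C, 0x093C),  # Nukta
    ("M", 0x093E, 0x094C),  # Matras
    ("H", 0x094D, 0x094D),  # Halant/Virama
]


def categories(text: str) -> str:
    """Map each code point to its variable name, or '?' if it has none."""
    out = []
    for ch in text:
        cp = ord(ch)
        out.append(next((name for name, lo, hi in CATEGORY_RANGES
                         if lo <= cp <= hi), "?"))
    return "".join(out)


# Vowel sequence:      V[B|D|X]
# Consonant sequence:  up to three (C[N]H) groups, then C[N], then either
#                      a Halant or an optional Matra plus optional B/D/X.
AKSHAR = re.compile(r"V[BDX]?|(?:CN?H){0,3}CN?(?:H|M?[BDX]?)")


def is_hindi_akshar(text: str) -> bool:
    """True if the text forms exactly one akshar under this reading of the rules."""
    return AKSHAR.fullmatch(categories(text)) is not None
```

The `{0,3}` bound reflects the "up to 4 consonants" limit stated for Hindi; dropping the bound would mirror the statement that the WLE rules do not restrict the number of consonants joined by a Halant.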
6 Variants
There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 "Visually confusables" in Appendix A "Visually confusable characters/sequences" lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creation of certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ (U+0906) | आ़ (U+0906 U+093C)
ओ (U+0913) | ओ़ (U+0913 U+093C)
ा (U+093E) | ा़ (U+093E U+093C)
ो (U+094B) | ो़ (U+094B U+093C)
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) have a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new case of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination आ़़ (U+0906 U+093C U+093C), where a Nukta will need to follow another Nukta. To prevent this, a variant context rule has been added to all the above nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 set character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and reverse variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
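The effect of the "not followed by Nukta" condition can be illustrated with a small sketch. It is not how an LGR tool evaluates variants; the names `nukta_variant_positions` and `add_nukta_at` are hypothetical, and the host set simply lists the four Variant 1 code points of Table 16.

```python
NUKTA = "\u093C"

# Variant 1 code points from Table 16 (AA, O, vowel sign AA, vowel sign O).
NUKTA_HOSTS = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variant_positions(label: str) -> list:
    """Indices where the Variant 1 -> Variant 2 mapping applies.

    A host qualifies only if it is NOT already followed by a Nukta,
    which is the variant context rule described above.
    """
    return [i for i, ch in enumerate(label)
            if ch in NUKTA_HOSTS and label[i + 1:i + 2] != NUKTA]


def add_nukta_at(label: str, index: int) -> str:
    """Produce the variant label with a Nukta inserted after position index."""
    return label[:index + 1] + NUKTA + label[index + 1:]
```

After a Nukta has been added at a position, that position no longer qualifies, so the substitution cannot regenerate itself into a double-Nukta sequence.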
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.

Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:

Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N

When the variant from the overlapping sets is substituted for 0906 or 093E, respectively, in the corresponding sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence or absence of the Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection of these contexts.
Likewise, for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:

Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
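The extension step described above can be reproduced mechanically. The following is a small sketch under stated assumptions: sequences are modelled as tuples of integers, and the helper names `with_nukta` and `extend` are hypothetical, not part of any LGR tool.

```python
NUKTA = 0x093C
NUKTA_BASES = {0x0906, 0x093E}  # code points that have a Nukta variant

# The four original variant sets, each as a list of code point sequences.
SETS = {
    "A": [(0x093E, 0x0901), (0x0949, 0x0902)],
    "B": [(0x0906, 0x0901), (0x0911, 0x0902)],
    "C": [(0x0906, 0x0902), (0x0974,)],
    "D": [(0x093B,), (0x093E, 0x0902)],
}

def with_nukta(seq):
    """Insert a Nukta after the first 0906 or 093E in seq, or return None."""
    for i, cp in enumerate(seq):
        if cp in NUKTA_BASES:
            return seq[:i + 1] + (NUKTA,) + seq[i + 1:]
    return None

def extend(sets):
    """Add the Nukta-bearing sequence to every set that contains a base."""
    extended = {}
    for name, members in sets.items():
        extra = [with_nukta(m) for m in members]
        extended[name] = members + [
            e for e in extra if e is not None and e not in members
        ]
    return extended

for name, members in extend(SETS).items():
    print(name, [" ".join(f"{cp:04X}" for cp in m) for m in members])
```

Each set gains exactly one member, matching the four additions listed in the text.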
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule “when(follows-only-C-or-N)” (see Section 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another variant set that overlaps with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences that include 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.

Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
ऺ U+093A | ं U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
ॆ U+0946 | े U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point contexts, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, many users tend to ignore the phonetic effect of its presence or absence at the end of a word. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases, the presence or absence of the final Halant is clearly visible, and there is no apparent case for making them variant pairs. In the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which when followed by an Anusvara (U+0902) create a shape that resembles a Candrabindu (U+0901). This gives rise to pairs which are rendered exactly alike in many fonts. Though some fonts render them differently, the behavior is not consistent and leaves room for ambiguity. The actual variant pairs are as follows:

Variant 1 | Variant 2
ँ U+0901 | ॅं U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not join properly in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (क + U+0901) and कॅं (क + U+0945 U+0902)
Variant Pair 2: काँ (क + U+093E U+0901) and कॉं (क + U+0949 U+0902)
Ideally, U+0945 U+0902 and U+0949 U+0902 should each be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) requires that, to create a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). Note that this sequence begins with a Matra sign, i.e. U+0945. While the Candrabindu (U+0901) can be preceded by a Consonant, a Vowel, a Nukta or a Matra, the same is not true for U+0945, which is a Matra. A Matra can only be preceded by a Consonant, or by a Consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or by a Consonant followed by a Nukta.
There is no additional constraint on the character preceding (U+0945 U+0902), as the Candrabindu can be preceded by any character that can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as the mapping (U+0901) -> (U+0945 U+0902).
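The context condition on the Candrabindu mapping can be sketched as a simple look-behind check. This is an illustration only: the consonant test below assumes the contiguous range U+0915 to U+0939, which is an approximation of the Consonant category and not the exact repertoire of this LGR, and the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"

def is_consonant(ch: str) -> bool:
    # Assumption: treat U+0915..U+0939 as the Consonant category.
    return "\u0915" <= ch <= "\u0939"

def candrabindu_variant_defined(label: str, i: int) -> bool:
    """True if U+0901 at index i may map to U+0945 U+0902.

    The mapping exists only when the Candrabindu is preceded by a
    Consonant, or by a Consonant followed by a Nukta.
    """
    if label[i] != CANDRABINDU or i == 0:
        return False
    prev = label[i - 1]
    if is_consonant(prev):
        return True
    return prev == NUKTA and i >= 2 and is_consonant(label[i - 2])

print(candrabindu_variant_defined("\u0915\u0901", 1))        # KA + Candrabindu
print(candrabindu_variant_defined("\u0906\u0901", 1))        # AA + Candrabindu
print(candrabindu_variant_defined("\u0915\u093E\u0901", 2))  # KA + Matra + Candrabindu
```

Only the first call reports that the variant is defined; after a Vowel or a Matra the replacement would place the Matra U+0945 in an invalid position.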
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing one of these variants is chosen first, the other equivalent variant labels should be blocked.
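The first-come, first-served effect of the blocked disposition can be sketched as follows. This is a toy model and not how any registry is implemented: the `Registry` class and `variant_key` function are hypothetical, and only the four context-free single code point pairs from Table 17 are used to compute the key.

```python
# Map one member of each pair to the other so variants share a key.
# Pairs taken from Table 17: 0910/090E, 0947/0946, 0914/0975, 094C/094F.
CANON = {"\u0910": "\u090E", "\u0947": "\u0946",
         "\u0914": "\u0975", "\u094C": "\u094F"}

def variant_key(label: str) -> str:
    """Labels that are variants of each other yield the same key."""
    return "".join(CANON.get(ch, ch) for ch in label)

class Registry:
    def __init__(self):
        self._taken = {}  # variant key -> label that was allocated first

    def apply(self, label: str) -> str:
        key = variant_key(label)
        if key in self._taken:
            if self._taken[key] == label:
                return "already-allocated"
            return "blocked"
        self._taken[key] = label
        return "allocated"

registry = Registry()
print(registry.apply("\u0915\u0947"))  # KA + vowel sign E
print(registry.apply("\u0915\u0946"))  # KA + vowel sign SHORT E
```

Whichever of the two labels is applied for first is allocated; the other is then blocked, with no preference between them.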
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the case where a label in one script can be composed in such a way that it resembles an entire label in a different script.
Every individual LGR under the NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word ‘most’ is used here as it is not practical to cover all the possible “Consonant + Halant + Consonant + …” cases. However, for Devanagari, all cases of “Consonant + Halant + Consonant” combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. Table 19 lists the cases proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.
Note that none of the combinations listed in Table 19 and Table 20 are considered equivalents of each other, semantically or otherwise. They are grouped only on the basis of possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं U+0902 | ਂ U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
ऺ U+093A | ਂ U+0A02
़ U+093C | ਼ U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
ॅ U+0945 | ੱ U+0A71
ॆ U+0946 | ੇ U+0A47
ॆ U+0946 | ੋ U+0A4B
े U+0947 | ੇ U+0A47
े U+0947 | ੋ U+0A4B
ै U+0948 | ੈ U+0A48
ॖ U+0956 | ੁ U+0A41
ॗ U+0957 | ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a set of possible cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA) and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (where CN is a C followed by an N).
3. M must be preceded by C or CN (where CN is a C followed by an N).
4. X must be preceded by either of V, C, N or M.
5. B must be preceded by either of V, C, N or M.
6. D must be preceded by either of V, C, N or M.
7. V can NOT be preceded by H (details in “Case of V preceded by H”).
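The seven rules above translate directly into a look-behind check over a label. The sketch below is illustrative only: the category ranges for C, V and M are assumed contiguous Unicode ranges rather than the exact repertoire of Table 6, the permitted sequences of Table 7 are ignored, and the names `category` and `violations` are hypothetical.

```python
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = set("\u0906\u0913")
M1 = set("\u093E\u094B")
SIGNS = {"\u0902": "B", "\u0901": "D", "\u0903": "X",
         "\u094D": "H", "\u093C": "N"}

def category(ch: str):
    """Return the WLE symbol for ch, or None (ranges are assumptions)."""
    if not ch:
        return None
    if "\u0915" <= ch <= "\u0939":
        return "C"
    if "\u0905" <= ch <= "\u0914":
        return "V"
    if "\u093E" <= ch <= "\u094C":
        return "M"
    return SIGNS.get(ch)

def violations(label: str) -> list[tuple[int, int]]:
    """Return (index, rule number) for every WLE rule that is broken."""
    found = []
    for i, ch in enumerate(label):
        prev = label[i - 1] if i > 0 else ""
        prev2 = label[i - 2] if i > 1 else ""
        cat, pcat = category(ch), category(prev)
        after_c_or_cn = pcat == "C" or (pcat == "N" and category(prev2) == "C")
        if cat == "N" and prev not in C1 | V1 | M1:
            found.append((i, 1))
        elif cat == "H" and not after_c_or_cn:
            found.append((i, 2))
        elif cat == "M" and not after_c_or_cn:
            found.append((i, 3))
        elif cat in ("X", "B", "D") and pcat not in ("V", "C", "N", "M"):
            found.append((i, {"X": 4, "B": 5, "D": 6}[cat]))
        elif cat == "V" and pcat == "H":
            found.append((i, 7))
    return found

# Vowel A (U+0905) directly after a Halant breaks rule 7.
print(violations("\u0906\u092E\u094D\u0905\u091A\u093E\u0930"))
```

An empty list means that the label passes all seven rules under these simplified category definitions.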
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see Section 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see Section 6.1.2)
Case of Eyelash Reph
The WLE rules make no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins with either a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any valid akshar-ending character. Only the case “V preceded by H” needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार aːməchaːr ‘Mango pickle’ (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of the hyphen, the space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data
Following is the full list of NBGP members with their language expertise:

Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DN Domains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant httpsमराभारत | India | Hindi

In addition, the following persons externally gave inputs to the NBGP for the respective languages/scripts:

Name | Language/Script Expertise
Ajit Kumar | Awadhi/Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, “Maximal Starting Repertoire - MSR-4: Overview and Rationale”, 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, “Representing Label Generation Rulesets Using XML”, RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, “Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels”, RFC 8228, August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics & Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11.0, http://www.unicode.org/versions/Unicode11.0.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, “Variant Issues Report”, ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, “South and Central Asia-I - Official Scripts of India”, Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1. Diringer, D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchinson, London, 1968.
10.2 DEVANĀGARĪ
1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966.]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World’s Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London Ph.D. dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10 (2): 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1. IS 10401: 8-bit code for information interchange, 1982.
2. IS 10315: 7-bit coded character set for information interchange, 1985.
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987.
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001.
5. ISO 2375: Procedure for registration of escape sequences, 2003.
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001.
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are grouped only on the basis of possible visual confusability.
At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as part of a domain name, or as a single word with no surrounding text from the same script to distinguish them, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all of its constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it provides a visual distinction and hence makes the string non-confusable.
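The "all constituents must map" condition can be sketched as a whole-label mapping that fails as soon as one code point has no counterpart. The sketch is a simplification: it uses a handful of single code point pairs taken from Table 19, ignores the multi-code-point sequences in that table, and the function name `gurmukhi_variant` is hypothetical.

```python
from typing import Optional

# A few Devanagari -> Gurmukhi pairs from Table 19 (single code points only).
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # GA
    "\u091F": "\u0A1F",  # TTA
    "\u0920": "\u0A20",  # TTHA
    "\u093F": "\u0A3F",  # vowel sign I
    "\u0940": "\u0A40",  # vowel sign II
}

def gurmukhi_variant(label: str) -> Optional[str]:
    """Return the Gurmukhi look-alike label, or None if any code point
    has no cross-script counterpart (the label is then distinguishable)."""
    mapped = []
    for ch in label:
        if ch not in DEVA_TO_GURU:
            return None
        mapped.append(DEVA_TO_GURU[ch])
    return "".join(mapped)

print(gurmukhi_variant("\u091F\u0940"))  # TTA + II: every code point maps
print(gurmukhi_variant("\u091F\u0915"))  # TTA + KA: KA has no counterpart
```

A single unmapped code point is enough to make the whole label free of a cross-script variant, which mirrors the whole-label nature of this check.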
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 | Bengali (see footnote 17)
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali
Table 22: Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
3.3.7 Visarga (ः - U+0903) and Avagraha (ऽ - U+093D)
The Visarga is frequently used in Sanskrit and represents a sound very close to h, for example दुःख duhkh ‘sorrow, unhappiness’ (U+0926 U+0941 U+0903 U+0916).
The Avagraha ऽ (U+093D) creates an extra stress on the preceding vowel and is used in Sanskrit texts. It is rarely used in other languages written in Devanagari. In the case of the LGR, the Avagraha is not part of the repertoire, as it is barred in the Maximal Starting Repertoire.
3.3.8 Zero Width Non-joiner (U+200C) and Zero Width Joiner (U+200D)
The Zero Width Non-joiner (ZWNJ) is an invisible character used in certain cases (after a Halant) where the default conjunct formation is to be explicitly restricted and the Halant joining the two consonants participating in the conjunct formation needs to be explicitly shown. For example, the conjunct क्ष (ksha), which is formed by क (ka) + ् (halant) + ष (sha), is rendered as क्‌ष when formed by क (ka) + ् (halant) + Zero Width Non-joiner + ष (sha). In certain cases, for certain communities, this visual rendition creates a difference in the manner in which those combinations are pronounced.
The Zero Width Joiner (ZWJ) is another invisible character, which is used in certain cases (mostly after a Halant) in which a particular conjunct combination is rendered such that the shapes of the constituent consonants may not be directly visible in the conjunct shape. For example, the conjunct क्ष (ksha), which is formed by क (ka) + ् (halant) + ष (sha), does not show the half form of ka joining with sha. However, using ZWJ, the shapes of the constituent consonants are preserved in the visual depiction, i.e. क्‍ष, formed by क (ka) + ् (halant) + Zero Width Joiner + ष (sha).
Earlier, the ZWJ was recommended by the Unicode Consortium for generating certain special conjuncts like the Eyelash Ra (more details in Section 5.2). However, with the new recommendations in place, this usage of ZWJ is now not encouraged.
4 Overall Development Process and Methodology
Under the Neo-Brahmi Generation Panel there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP has ensured that the fundamental philosophy behind building these LGRs is in sync across all the Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles
The NBGP adopts the following broad principles for the selection of code points in the code point repertoire, across the board for all the scripts within its ambit.
4.1.1 Inclusion principles
4.1.1.1 Modern usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only, or for archival purposes, will not be considered for inclusion in the code point repertoire.
4.1.1.2 Unambiguous use
Every character proposed should have an unambiguous understanding among the linguistic community about its usage in the language.
4.1.2 Exclusion principles
The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope
The code point repertoire for the root zone is a very special case, high up in the protocol hierarchy, so the canvas of characters available for selection as part of the Root Zone code point repertoire is already constrained by the protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Standard
Out of all the characters needed by a given script, a character that is not encoded in Unicode cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode, being the character-encoding standard for providing the maximum possible representation of a given script/language, has encoded as far as possible all the characters needed by the script. However, domain names being a specialized case, they are governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ़ (U+093C), can be used instead.
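The relationship between U+0958 and its decomposed form can be checked with Python's standard `unicodedata` module. This sketch only demonstrates Unicode normalization; it does not implement the IDNA validity checks themselves.

```python
import unicodedata

qa = "\u0958"  # DEVANAGARI LETTER QA (precomposed)

# U+0958 has a canonical decomposition and is excluded from recomposition,
# so even NFC yields the two code point sequence KA + NUKTA.
nfc = unicodedata.normalize("NFC", qa)
print([f"U+{ord(ch):04X}" for ch in nfc])
print(unicodedata.name(nfc[0]), "+", unicodedata.name(nfc[1]))
```

Since domain name labels are expected to be in NFC, the precomposed letter never survives normalization, and only the decomposed sequence can appear in a label.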
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These characters are required in certain cases where a typical visual shape of an akshar is desired.
In domain names, due to the absence of space “ ” or tab “-”, there will be cases where the inability to use ZWNJ can pose some issues: two words need to be joined together, the first word needs to end in an Explicit Halant and the next word begins with a consonant. In that case a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if the two words देश् (deš, nation) and विदेश (videš, foreign land) are juxtaposed, the resultant word, i.e. “देश्विदेश”10, is not the appropriate way of rendering it. The appropriate rendering would be “देश्‌विदेश”, which can be achieved by adding a ZWNJ in between the two words.
As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire if necessary.
However, the exclusion of the ZWJ from the MSR may not have much of an impact, as better alternatives are already available for depicting the cases for which the ZWJ was earlier used. Some specific shapes11 may not be able to be made; however, there will not be any impact on the phonetic level.
iii. Maximal Starting Repertoire
The root zone LGR is a repertoire of the characters that are going to be used for creating root zone TLDs, which in turn are an even more specialized case of domain names. The Root Zone LGR Procedure therefore introduces additional exclusions on the set of characters allowed by IDNA. For example, the Devanagari Sign Avagraha ऽ (U+093D), even though allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].
To sum up, the restrictions start off by admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
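The three successive filters summarized above can be pictured as nested sets. The sketch below is illustrative only: the three sets are hypothetical, deliberately tiny stand-ins built from the examples in this section, not the real Unicode, IDNA2008 or MSR data, and the function name is an assumption.

```python
# Hypothetical miniature stand-ins for the three layers (not real data).
UNICODE_ENCODED = {0x0915, 0x093C, 0x093D, 0x0958}
IDNA_PVALID = {0x0915, 0x093C, 0x093D}   # U+0958 is not allowed by IDNA
MSR = {0x0915, 0x093C}                   # U+093D is excluded by the MSR

FILTERS = [("Unicode", UNICODE_ENCODED), ("IDNA", IDNA_PVALID), ("MSR", MSR)]


def first_rejecting_filter(cp):
    """Return the name of the first layer that rejects cp, or None if all admit it."""
    for name, allowed in FILTERS:
        if cp not in allowed:
            return name
    return None


for cp in (0x0915, 0x0958, 0x093D):
    print(f"U+{cp:04X}", first_rejecting_filter(cp))
```

Ordering the filters from the broadest layer to the narrowest mirrors the text: a code point reaches the MSR check only if Unicode and IDNA have already admitted it.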
10 In this particular case it is possible to get the required display by dropping the explicit Halant at the end of the word; however, one can then argue that the pronunciation of the two words, i.e. देश् and देश, is different and hence it changes the fundamental word.
11 Case of क्ष and क्‍ष: the first is composed with क + ् + ष while the latter is with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.
4.1.2.3 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970), etc. will not be included.
4.1.2.4 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ॄ (U+0944) and ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.
4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through Public Comment process
The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments were editorial in nature. Some comments called for attention to the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided for each individual comment whether it needed a change in the overall LGR recommendation.
Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2 Devanagari Code Page from [MSR]
Color convention12:
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the MSR for various reasons - Pinkish background
Not PVALID in IDNA2008 - White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references for Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 | 0901 | ँ | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1 Hindi, Nepali | [0][101][102][103][105][108][110][111][112][113]
2 | 0902 | ं | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
3 | 0903 | ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 Hindi | [0][101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][101][102][108][112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 Nepali | [0][102][103][110][112][113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
53 | 093A | ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
55 | 093C | ़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 Hindi | [0][101][105][108][110][109][111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
59 | 0941 | ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
60 | 0942 | ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
61 | 0943 | ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E = candra | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0][100][101][108]
63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant/Virama | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9][100][102][108][112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the “Eyelash Reph”13 construct.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7 Sequences
13 Unicode uses the term “Eyelash Ra” instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by Normal Ra, U+0930), the term “Reph” is used here.
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. In the next section there are detailed akshar formation rules as applicable to the representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7 the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second section lists four operators, along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol   Function
|        Alternative
[ ]      Optional
*        Variable Repetition
( )      Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may optionally be followed by an Anusvara (B), a Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ (a) | U+0905
Vowel + Anusvara | V[B] | अं (aṁ) | अ ं (U+0905 U+0902)
Vowel + Candrabindu | V[D] | अँ (aṃ) | अ ँ (U+0905 U+0901)
Vowel + Visarga | V[X] | अः (aḥ) | अ ः (U+0905 U+0903)
Table 9
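The vowel sequence rule V[B|D|X] maps naturally onto a regular expression. The sketch below is a Hindi-oriented illustration, not the normative WLE rule: it assumes that V covers the independent vowels of Table 6 in the range U+0905 to U+0914 (U+090C is not in the repertoire), and the function name is hypothetical.

```python
import re

# V: independent vowels U+0905..U+0914 without U+090C (assumed reading of Table 6),
# followed by at most one of D (U+0901), B (U+0902) or X (U+0903).
VOWEL_SEQUENCE = re.compile(r"[\u0905-\u090B\u090D-\u0914][\u0901-\u0903]?")


def is_vowel_sequence(text):
    """True if text is exactly one vowel sequence V[B|D|X]."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None


print(is_vowel_sequence("\u0905\u0902"))        # Vowel + Anusvara
print(is_vowel_sequence("\u0905\u0902\u0903"))  # Anusvara followed by Visarga
```

Using `fullmatch` rather than `search` ensures that a second modifier, such as a Visarga after an Anusvara, causes rejection instead of being silently ignored.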
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may optionally be followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the consonant sequence after the N, M and H. Each of these is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क (ka) | U+0915 <single character>
Consonant + Nukta | C[N] | क़ (ḳa) | क ़ (U+0915 U+093C)
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि (ki) | क ि (U+0915 U+093F)
Consonant + Anusvara | C[B] | कं (kaṁ) | क ं (U+0915 U+0902)
Consonant + Candrabindu | C[D] | कँ (kaṃ) | क ँ (U+0915 U+0901)
Consonant + Visarga | C[X] | कः (kaḥ) | क ः (U+0915 U+0903)
Consonant + Halant | C[H] | क् (k, pure consonant) | क ् (U+0915 U+094D)
Table 11
2A. A CM sequence can optionally be followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं (kīṁ) | क ी ं (U+0915 U+0940 U+0902)
Consonant + Matra + Candrabindu | CM[D] | काँ (kāṃ) | क ा ँ (U+0915 U+093E U+0901)
Consonant + Matra + Visarga | CM[X] | कीः (kīḥ) | क ी ः (U+0915 U+0940 U+0903)
Table 12
3. A sequence of consonants (up to 4) joined by Halant14:
*3(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य (nkrya) | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets
3A. The combination may be followed by M, B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की (kkī) | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं (kkaṁ) | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ (kkaṃ) | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः (kkaḥ) | U+0915 U+094D U+0915 U+0903
Table 14
14 In the case of Sanskrit, it can join up to 5 consonants.
3B. *3(CH)CM may be followed by a B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं (kkīṁ) | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ (kkīṃ) | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः (kkīḥ) | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
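The consonant sequence rules 1 to 3B can be sketched by first mapping each code point to its variable letter and then matching the resulting class string. This is an illustrative reading of the Hindi rules above, not the normative WLE rules of Section 7: the code point ranges, the `char_class` helper and the decision to allow a final Halant only after a single consonant are assumptions of this sketch.

```python
import re

# Code points that fall inside the ranges below but are not in the repertoire.
EXCLUDED = {0x0929, 0x0931, 0x0934, 0x0944}
SINGLE = {0x0901: "D", 0x0902: "B", 0x0903: "X", 0x093C: "N", 0x094D: "H"}


def char_class(ch):
    """Map one character to its variable letter (C, M, B, D, X, H or N)."""
    cp = ord(ch)
    if cp in EXCLUDED:
        raise ValueError(f"U+{cp:04X} is not in the repertoire")
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x093E <= cp <= 0x094C:
        return "M"
    if cp in SINGLE:
        return SINGLE[cp]
    raise ValueError(f"U+{cp:04X} has no class in this sketch")


# Up to three C[N]H prefixes, a final C[N], then an optional M and one of B/D/X;
# alternatively a single C[N] followed by a Halant.
CONSONANT_SEQUENCE = re.compile(r"(?:CN?H){0,3}CN?M?[BDX]?|CN?H")


def is_consonant_sequence(text):
    try:
        classes = "".join(char_class(ch) for ch in text)
    except ValueError:
        return False
    return CONSONANT_SEQUENCE.fullmatch(classes) is not None


print(is_consonant_sequence("\u0928\u094d\u0915\u094d\u0930\u094d\u092f"))  # CHCHCHC
```

Matching on class letters keeps the pattern close to the notation of Section 5.5.2; the `{0,3}` bound reflects the limit of four consonants stated for Hindi, which the WLE rules of Section 7 do not impose.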
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
6 Variants
There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 “Visually confusables” in Appendix A “Visually confusable characters/sequences” lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ (U+0906) | आ़ (U+0906 U+093C)
ओ (U+0913) | ओ़ (U+0913 U+093C)
ा (U+093E) | ा़ (U+093E U+093C)
ो (U+094B) | ो़ (U+094B U+093C)
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) have a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. E.g. in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the leading U+0906 of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C), where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
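The effect of the "not followed by a Nukta" condition can be illustrated with a small sketch. The function names are hypothetical and the code only models the four mappings listed above; it is not how an LGR tool necessarily evaluates the rule.

```python
NUKTA = "\u093c"
# Variant 1 members of Table 16: U+0906, U+0913, U+093E, U+094B.
VARIANT1 = {"\u0906", "\u0913", "\u093e", "\u094b"}


def nukta_variant_sites(label):
    """Indexes where the Variant 1 -> Variant 2 mapping (append a Nukta) applies."""
    return [
        i
        for i, ch in enumerate(label)
        if ch in VARIANT1 and label[i + 1:i + 2] != NUKTA
    ]


def apply_at(label, i):
    """Return the variant label with a Nukta inserted after position i."""
    return label[: i + 1] + NUKTA + label[i + 1:]


original = "\u0906\u092e"              # U+0906 U+092E
variant = apply_at(original, 0)        # U+0906 U+093C U+092E
print(nukta_variant_sites(original))   # [0]
print(nukta_variant_sites(variant))    # []
```

Once the Nukta has been added, the same position is no longer eligible, so the mapping cannot be applied repeatedly to produce a Nukta followed by another Nukta.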
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an “effective null variant”, where the U+093C in one sequence maps to “nothing” in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E respectively into the sequences of Variant Sets A, B, C and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
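The symmetry and transitivity that the extended sets are meant to guarantee can be checked mechanically. The following sketch encodes the four extended sets as lists of code point sequences and derives every ordered mapping inside a set; the data layout and helper names are assumptions made for illustration, and the context rules of Sets C and D are not modelled.

```python
from itertools import permutations

# Extended variant sets A-D, each member written as space-separated hex code points.
VARIANT_SETS = {
    "A": ["093E 0901", "093E 093C 0901", "0949 0902"],
    "B": ["0906 0901", "0906 093C 0901", "0911 0902"],
    "C": ["0906 0902", "0906 093C 0902", "0974"],
    "D": ["093B", "093E 0902", "093E 093C 0902"],
}


def to_text(sequence):
    """Convert '0906 093C' style notation into the actual string."""
    return "".join(chr(int(cp, 16)) for cp in sequence.split())


def mappings(members):
    """All ordered (source, target) pairs between distinct members of one set."""
    return set(permutations(members, 2))


def is_symmetric(pairs):
    return all((t, s) in pairs for s, t in pairs)


def is_transitive(pairs):
    return all(
        (a, d) in pairs
        for a, b in pairs
        for c, d in pairs
        if b == c and a != d
    )


for name, members in VARIANT_SETS.items():
    pairs = mappings(members)
    print(name, len(pairs), is_symmetric(pairs), is_transitive(pairs))
```

With three members per set, each set yields six directed mappings; defining all of them explicitly is what makes the relation symmetric and transitive by construction.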
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 technically overlaps with the four sets above. However, because of the variant context rule “when(follows-only-C-or-N)” (see Section 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set that overlaps with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ (U+0973) | अं (U+0905 U+0902)
ऺ (U+093A) | ं (U+0902)
ॴ (U+0974) | आं (U+0906 U+0902)
ऻ (U+093B) | ां (U+093E U+0902)
ऎ (U+090E) | ऐ (U+0910)
ॆ (U+0946) | े (U+0947)
ॵ (U+0975) | औ (U+0914)
ॏ (U+094F) | ौ (U+094C)
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point contexts, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu ँ (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara ं (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1 | Variant 2
ँ (U+0901) | ॅं (U+0945 U+0902)
ाँ (U+093E U+0901) | ॉं (U+0949 U+0902)
अँ (U+0905 U+0901) | ॲं (U+0972 U+0902)
एँ (U+090F U+0901) | ऍं (U+090D U+0902)
आँ (U+0906 U+0901) | ऑं (U+0911 U+0902)
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, like a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for (U+0945), which is a Matra. A Matra can only be preceded by a Consonant or a Consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902) as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
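The context rule above can be illustrated with a minimal Python sketch. It is not taken from the accompanying XML file: the function names are hypothetical, the consonant set is approximated by the Unicode range U+0915 to U+0939 rather than the exact repertoire, and only the first pair of Table 18 is handled.

```python
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093C"

# Assumption: consonants approximated by the range U+0915..U+0939,
# not the exact consonant list of the proposed repertoire.
CONSONANTS = {chr(cp) for cp in range(0x0915, 0x093A)}


def context_holds(label: str, i: int) -> bool:
    """True if the code point at index i is preceded by C or by C + Nukta."""
    if i >= 1 and label[i - 1] in CONSONANTS:
        return True
    return i >= 2 and label[i - 1] == NUKTA and label[i - 2] in CONSONANTS


def candra_anusvara_variant(label: str) -> str:
    """Replace each Candrabindu that satisfies the context rule
    with the sequence U+0945 U+0902; leave all other code points alone."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and context_holds(label, i):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

In this sketch a Candrabindu that follows a Vowel or a Matra is left untouched, because the replacement would begin with a Matra in a position where a Matra is not allowed.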
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
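As a rough illustration of the blocked disposition, the following Python sketch treats two labels as equivalent when they differ only by the pairs of Table 18, and lets whichever label is applied for first block the other. The `Registry` class and `variant_key` function are hypothetical names, and Tables 16 and 17 are not covered here.

```python
# Variant 2 -> Variant 1 for the five pairs of Table 18.
# The sequence pairs come first, the bare Candra E + Anusvara pair last.
TO_VARIANT_1 = [
    ("\u0949\u0902", "\u093E\u0901"),
    ("\u0972\u0902", "\u0905\u0901"),
    ("\u090D\u0902", "\u090F\u0901"),
    ("\u0911\u0902", "\u0906\u0901"),
    ("\u0945\u0902", "\u0901"),
]


def variant_key(label: str) -> str:
    """Map a label to a key shared by all of its Table 18 variant labels."""
    for variant_2, variant_1 in TO_VARIANT_1:
        label = label.replace(variant_2, variant_1)
    return label


class Registry:
    """Hypothetical first-come registry with blocked variants."""

    def __init__(self) -> None:
        self._taken: dict[str, str] = {}

    def apply(self, label: str) -> bool:
        """Return True if the label is allocated, False if it is blocked
        by (or identical to) an earlier allocation."""
        key = variant_key(label)
        if key in self._taken:
            return False
        self._taken[key] = label
        return True
```

Mapping every label to a single key keeps the check symmetric, which matches the statement that there is no preference among the variants.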
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are the variants that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are deemed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं U+0902 | ਂ U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
ऺ U+093A | ਂ U+0A02
़ U+093C | ਼ U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
ॅ U+0945 | ੱ U+0A71
ॆ U+0946 | ੇ U+0A47
ॆ U+0946 | ੋ U+0A4B
े U+0947 | ੇ U+0A47
े U+0947 | ੋ U+0A4B
ै U+0948 | ੈ U+0A48
ॖ U+0956 | ੁ U+0A41
ॗ U+0957 | ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA) and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (where CN is a C followed by an N)
3. M must be preceded by C or CN (where CN is a C followed by an N)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
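The seven WLE rules listed above can be sketched as a small Python checker. This is only an illustration under stated assumptions: the category sets are approximated by contiguous Unicode ranges instead of the exact repertoire of Table 6, the function names are hypothetical, and the additional variant rules and the sequences of Table 7 are not modelled.

```python
# Assumption: categories approximated by Unicode ranges, not the exact repertoire.
CONSONANTS = {chr(cp) for cp in range(0x0915, 0x093A)}  # U+0915..U+0939
VOWELS = {chr(cp) for cp in range(0x0905, 0x0915)}      # U+0905..U+0914
MATRAS = {chr(cp) for cp in range(0x093E, 0x094D)}      # U+093E..U+094C
SIGNS = {"\u0902": "B", "\u0901": "D", "\u0903": "X", "\u094D": "H", "\u093C": "N"}

# Sets C1, V1 and M1 exactly as listed under rule 1.
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}

# Rules 4, 5 and 6 share one condition: preceded by V, C, N or M.
PRECEDED_BY_VCNM = {"X": 4, "B": 5, "D": 6}


def category(ch):
    """Return the WLE symbol for a code point, or None if unknown."""
    if ch in CONSONANTS:
        return "C"
    if ch in VOWELS:
        return "V"
    if ch in MATRAS:
        return "M"
    return SIGNS.get(ch)


def wle_violations(label):
    """Return the numbers of the WLE rules that the label breaks, in order."""
    broken = []
    for i, ch in enumerate(label):
        cat = category(ch)
        prev = label[i - 1] if i > 0 else ""
        prev_cat = category(prev) if prev else None
        prev2_cat = category(label[i - 2]) if i > 1 else None
        after_c_or_cn = prev_cat == "C" or (prev_cat == "N" and prev2_cat == "C")
        if cat == "N" and prev not in C1 | V1 | M1:
            broken.append(1)
        elif cat == "H" and not after_c_or_cn:
            broken.append(2)
        elif cat == "M" and not after_c_or_cn:
            broken.append(3)
        elif cat in PRECEDED_BY_VCNM and prev_cat not in {"V", "C", "N", "M"}:
            broken.append(PRECEDED_BY_VCNM[cat])
        elif cat == "V" and prev_cat == "H":
            broken.append(7)
    return broken
```

For instance, the multi-word example discussed under "Case of V preceded by H" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930) is reported as breaking rule 7 by this sketch, while a label such as U+092D U+093E U+0930 U+0924 yields an empty list.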
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of a context rule is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any of the valid akshar-ending characters. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार (aːməchaːr, "Mango pickle"; U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of the hyphen, the space or the Zero Width Non-Joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire: MSR-4 Overview and Rationale", 7 February 2019, httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, httpswwwethnologuecomaboutlanguage-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, httpstoolsietforghtmlrfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, httpstoolsietforghtmlrfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, httpscdacinindexaspxid=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, httpscdacinindexaspxid=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, httpscdacin (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11, httpwwwunicodeorgversionsUnicode110 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, httpwwwunicodeorgversionsUnicode500 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, httpwwwunicodeorgversionsUnicode510 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, httpwwwunicodeorgversionsUnicode600 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, httpswwwomniglotcomwritinghindihtm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, httpswwwomniglotcomwritingmarathihtm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, httpswwwomniglotcomwritingsanskrithtm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, httpswwwomniglotcomwritingsindhihtm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, httpswwwomniglotcomwritingkashmirihtm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), httpwwwunicodeorgversionsUnicode1000ch12pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, httpunicodeorg~emulleriwgp8utcdochtml (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, httpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, httpswwwomniglotcomwritingbodohtm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, httpswwwomniglotcomwritingmaithilihtm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, httpswwwomniglotcomwritingkonkanihtm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, httpswwwomniglotcomwritingnepalihtm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, httpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
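The whole-label condition in the paragraph above can be sketched in Python as follows. The mapping is a small single-code-point excerpt of the Devanagari-Gurmukhi pairs in Table 19, chosen for illustration only; the function name is hypothetical, and multi-code-point sequences and one-to-many pairs are left out.

```python
# Illustrative excerpt of Table 19 (Devanagari -> Gurmukhi), single code points only.
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # GA
    "\u091F": "\u0A1F",  # TTA
    "\u0920": "\u0A20",  # TTHA
    "\u092E": "\u0A38",  # Devanagari MA resembles Gurmukhi SA
    "\u093F": "\u0A3F",  # VOWEL SIGN I
    "\u0940": "\u0A40",  # VOWEL SIGN II
}


def gurmukhi_variant(label):
    """Return the Gurmukhi look-alike label, or None when at least one
    code point has no counterpart and so keeps the label distinguishable."""
    if not label or any(ch not in DEVA_TO_GURU for ch in label):
        return None
    return "".join(DEVA_TO_GURU[ch] for ch in label)
```

A single unmapped code point is enough for the function to return None, mirroring the statement that one distinguishing character makes the string non-confusable.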
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 | Bengali (see footnote 17)
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
4 Overall Development Process and Methodology
Under the Neo-Brahmi Generation Panel there are many different scripts belonging to separate Unicode blocks. Each of these scripts has been assigned a separate LGR; however, the Neo-Brahmi GP ensured that the fundamental philosophy behind building those LGRs is in sync across all the Brahmi-derived scripts. This is the Devanagari LGR, which caters to multiple languages written using Devanagari, mostly belonging to EGIDS scale 1 to 4.
4.1 Guiding Principles
The NBGP adopts the following broad principles for selection of code points in the code point repertoire across the board for all the scripts within its ambit.
4.1.1 Inclusion principles
4.1.1.1 Modern usage
Every character proposed should be in the everyday usage of a particular linguistic community. Characters which have been encoded in Unicode for transcription purposes only or for archival purposes will not be considered for inclusion in the code point repertoire.
4.1.1.2 Unambiguous use
Every character proposed should have an unambiguous understanding among the linguistic community about its usage in the language.
4.1.2 Exclusion principles
The main exclusion principle is that of External Limits on Scope. These comprise protocols or standards that are prerequisites to the Label Generation Rulesets. All further principles are in fact subsumed under these limitations, but have been spelt out separately for the sake of clarity.
4.1.2.1 External Limits on Scope
The code point repertoire for the root zone being a very special case up the ladder in the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:
i. The Unicode Standard
Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.
ii. IDNA Protocol
Unicode being the character-encoding standard for providing the maximum possible representation of a given script/language, it has encoded as far as possible all the possible characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.
For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + ़ (U+093C), can be used instead.
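The Qa example can be checked with Python's standard `unicodedata` module. The sketch below relies on the fact that U+0958 is not preserved by Unicode Normalization Form C, the form in which IDNA2008 expects labels; the helper name `to_nfc_label` is hypothetical and the snippet is an illustration, not part of the LGR itself.

```python
import unicodedata

QA = "\u0958"              # DEVANAGARI LETTER QA (precomposed)
KA_NUKTA = "\u0915\u093C"  # DEVANAGARI LETTER KA + DEVANAGARI SIGN NUKTA


def to_nfc_label(label: str) -> str:
    """Normalize a label to NFC before any repertoire check."""
    return unicodedata.normalize("NFC", label)


# The precomposed letter decomposes under NFC and is not recomposed,
# so only the Ka + Nukta sequence is stable.
print(to_nfc_label(QA) == KA_NUKTA)
print(to_nfc_label(KA_NUKTA) == KA_NUKTA)
```

Because normalization always yields the two-code-point form, a repertoire only needs to list U+0915 and U+093C to cover this letter.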
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These are required in certain cases where a typical visual shape of an akshar is desired.
In domain names, due to the absence of space " " or tab "-", there will be cases where the inability to use ZWNJ can pose some issues: two words need to be joined together, the previous word needs to end in an Explicit Halant and the next word begins with a consonant. In that case a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if the two words देश (deš, nation) and विदेश (videš, foreign land) are juxtaposed, the resultant word, i.e. "देश्विदेश" (see footnote 10), is not the appropriate way of rendering it. The appropriate rendering, with a visible Halant at the end of the first word followed by विदेश, can be achieved by adding a ZWNJ in between the two words.
As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire if necessary.
However, there may not be much of an impact of the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes (see footnote 11) may not be possible to produce; however, there will not be any impact at the phonetic level.
iii. Maximal Starting Repertoire
The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].
To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
10 In this particular case, though it is possible to get the required display by dropping the explicit Halant at the end of the word, one can argue that the pronunciation of the two words, i.e. देश् and देश, is different and hence it changes the fundamental word.
11 Case of the two shapes of the conjunct of क and ष: the first is composed with क + ् + ष while the latter is with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.
4.1.2.3 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970), etc. will not be included.
4.1.2.4 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, especially DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ॄ (U+0944) and ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.
4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through the Public Comment process
The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were of an editorial nature. Some comments required attention to the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided, for each individual comment received, whether it needed a change in the overall LGR recommendation.
Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2: Devanagari Code Page from [MSR]
Color convention (see footnote 12):
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the MSR for various reasons - Pinkish background
Not PVALID in IDNA2008 - White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references for Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points required for all the languages that NBGP has considered, as given in Section 3.2.
SrNo
UnicodeCodePoint
Glyph CharacterName Category
Examplelanguagesusingthecodepoint(Notexhaustive
list)
LanguagewithlowestEGIDSscaleusingthecodepoint
Reference
1 0901 DEVANAGARISIGNCANDRABINDU Candrabindu
BodoHindiKashmiriKonkaniMaithiliMarathiNepaliSantaliand
Sanskrit
1HindiNepali
[0][101][102][103][105][108][110][111][112]
[113]
2 0902 DEVANAGARISIGNANUSVARA
Anusvara(Bindu)
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
3 0903 ः DEVANAGARISIGNVISARGA Visarga
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
4 0905 अ DEVANAGARILETTERA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
5 0906 आ DEVANAGARILETTERAA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
6 0907 इ DEVANAGARILETTERI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
7 0908 ई DEVANAGARILETTERII Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
19
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 Hindi | [0][101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][101][102][108][112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 Nepali | [0][102][103][110][112][113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
53 | 093A | ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
55 | 093C | ़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 Hindi | [0][101][105][108][110][109][111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
59 | 0941 | ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
60 | 0942 | ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
61 | 0943 | ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E = candra | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0][100][101][108]
63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant / Virama | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9][100][102][108][112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8][104]
Table 6: Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, so that the “Eyelash Reph” construct (see footnote 13) can be included.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7: Sequences
13 Unicode uses the term “Eyelash Ra” instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by the normal Ra, U+0930), the term “Reph” is used here.
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of forming their words, known as akshar. The next section gives detailed akshar formation rules as applicable to the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required to accommodate additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of the descriptive names. The second section lists four operators, along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with vowels and the second with consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
5.5.2 Operators used
Symbol   Function
|        Alternative
[ ]      Optional
*        Variable Repetition
( )      Sequence Group
Table 8: Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may optionally be followed by an Anusvara (B), a Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ (a) | U+0905
Vowel + Anusvara | V[B] | अं (aṁ) | अ + ं (U+0905 U+0902)
Vowel + Candrabindu | V[D] | अँ (aṃ) | अ + ँ (U+0905 U+0901)
Vowel + Visarga | V[X] | अः (aḥ) | अ + ः (U+0905 U+0903)
Table 9
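The vowel sequence rule V[B|D|X] can be illustrated with a minimal Python sketch. It assumes that each code point is first mapped to its category letter and that the resulting category string is matched against a regular expression. The `CATEGORY` map below is a hypothetical excerpt covering only a few code points from Table 6, and the function names are illustrative, not part of the LGR.

```python
import re

# Hypothetical excerpt of the category assignment in Table 6 (not the full repertoire).
CATEGORY = {
    "\u0905": "V",  # DEVANAGARI LETTER A
    "\u0906": "V",  # DEVANAGARI LETTER AA
    "\u0902": "B",  # DEVANAGARI SIGN ANUSVARA
    "\u0901": "D",  # DEVANAGARI SIGN CANDRABINDU
    "\u0903": "X",  # DEVANAGARI SIGN VISARGA
}

# V[B|D|X]: a vowel optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(r"V[BDX]?")


def categories(text):
    """Map every code point to its category letter ('?' if unknown to this sketch)."""
    return "".join(CATEGORY.get(ch, "?") for ch in text)


def is_vowel_sequence(text):
    """Return True if the whole text is one Hindi vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(categories(text)) is not None
```

For instance, `is_vowel_sequence("\u0905\u0902")` accepts Vowel + Anusvara, while a vowel followed by both an Anusvara and a Visarga is rejected because only one of B, D or X may follow.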
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may optionally be followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the consonant sequence after the N, M and H. Each of these is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क (ka) | U+0915 <single character>
Consonant + Nukta | C[N] | क़ (ḳa) | क + ़ (U+0915 U+093C)
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], or Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि (ki) | क + ि (U+0915 U+093F)
Consonant + Anusvara | C[B] | कं (kaṁ) | क + ं (U+0915 U+0902)
Consonant + Candrabindu | C[D] | कँ (kaṃ) | क + ँ (U+0915 U+0901)
Consonant + Visarga | C[X] | कः (kaḥ) | क + ः (U+0915 U+0903)
Consonant + Halant | C[H] | क् (k, pure consonant) | क + ् (U+0915 U+094D)
Table 11
2A. A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं (kīṁ) | क + ी + ं (U+0915 U+0940 U+0902)
Consonant + Matra + Candrabindu | CM[D] | काँ (kāṃ) | क + ा + ँ (U+0915 U+093E U+0901)
Consonant + Matra + Visarga | CM[X] | कीः (kīḥ) | क + ी + ः (U+0915 U+0940 U+0903)
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14):
*3(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य (nkrya) | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets:
3A. The combination may be followed by M, B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की (kkī) | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं (kkaṁ) | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ (kkaṃ) | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः (kkaḥ) | U+0915 U+094D U+0915 U+0903
Table 14
14 In the case of Sanskrit, it can join up to 5 consonants.
3B. *3(CH)CM may be followed by a B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं (kkīṁ) | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ (kkīṃ) | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः (kkīḥ) | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
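The consonant sequence cases above (1, 2, 2A, 3, 3A and 3B) can be condensed into a single pattern over category letters. The following Python sketch is one possible condensation and not the panel's formal definition. It assumes that a Nukta may follow any consonant in the sequence (per the "coterminous" note in case 1) and that a conjunct may also end in a Halant, which the text lists explicitly only for a single consonant. The `CATEGORY` map is a hypothetical excerpt of Table 6.

```python
import re

# Hypothetical excerpt of the category assignment in Table 6.
CATEGORY = {
    "\u0915": "C", "\u0928": "C", "\u0930": "C", "\u092F": "C",  # KA, NA, RA, YA
    "\u093E": "M", "\u093F": "M", "\u0940": "M",                  # matras AA, I, II
    "\u0902": "B", "\u0901": "D", "\u0903": "X",                  # Anusvara, Candrabindu, Visarga
    "\u094D": "H", "\u093C": "N",                                 # Halant, Nukta
}

# Up to three "consonant [Nukta] Halant" groups, then a final consonant [Nukta],
# then either a Halant or an optional Matra followed by an optional B, D or X.
CONSONANT_SEQUENCE = re.compile(r"(?:CN?H){0,3}CN?(?:H|M?[BDX]?)")


def is_consonant_sequence(text):
    """Return True if the whole text is one Hindi consonant sequence (sketch)."""
    cats = "".join(CATEGORY.get(ch, "?") for ch in text)
    return CONSONANT_SEQUENCE.fullmatch(cats) is not None
```

The `{0,3}` bound reflects the limit of four consonants stated for Hindi. As noted above, the WLE rules in Section 7 do not impose that limit, so a checker for the Root Zone would not use this bound.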
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules when one takes into account digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
6 Variants
There are no characters or character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and are hence proposed as variants. A brief description of these variants follows, along with the variants themselves in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for the positioning of the Nukta character (U+093C), which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated that the Whole Label Evaluation rules (given in Section 7) be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are proposed:
Variant 1 | Variant 2
आ (U+0906) | आ़ (U+0906 U+093C)
ओ (U+0913) | ओ़ (U+0913 U+093C)
ा (U+093E) | ा़ (U+093E U+093C)
ो (U+094B) | ो़ (U+094B U+093C)
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character.
Thus, the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8828]). Second, this type of variant pair is an “effective null variant”, where the U+093C in one sequence maps to “nothing” in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
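The effect of the "not followed by Nukta" condition can be seen in a small Python sketch that enumerates the labels reachable through the Table 16 mappings. The function name and the approach (splitting the label into slots and taking the product of the alternatives) are illustrative assumptions, not the algorithm of any particular LGR tool.

```python
from itertools import product

NUKTA = "\u093C"
# Variant 1 characters from Table 16: AA, O, vowel sign AA, vowel sign O.
SANTALI_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variants(label):
    """Return every label reachable via the Table 16 mappings, the input included."""
    slots = []
    i = 0
    while i < len(label):
        ch = label[i]
        nxt = label[i + 1] if i + 1 < len(label) else ""
        after = label[i + 2] if i + 2 < len(label) else ""
        if ch in SANTALI_BASES and nxt == NUKTA and after != NUKTA:
            # Variant 2 present, and the context condition holds for the reverse mapping.
            slots.append((ch, ch + NUKTA))
            i += 2
        elif ch in SANTALI_BASES and nxt != NUKTA:
            # Variant 1 present and not followed by a Nukta: forward mapping applies.
            slots.append((ch, ch + NUKTA))
            i += 1
        else:
            # No mapping applies; the code point is kept as it is.
            slots.append((ch,))
            i += 1
    return {"".join(choice) for choice in product(*slots)}
```

In this sketch, U+0906 and U+0906 U+093C yield the same two-member set, while the (already invalid) sequence U+0906 U+093C U+093C yields only itself, so the mapping cannot regenerate a second Nukta.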
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the left-hand-side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence or absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when (followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D: 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when (followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when (followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when (followed-by-V-C-or-end)
Additional Notes:
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule “when (follows-only-C-or-N)” (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapping with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as the second element have V or M as the first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are proposed as variants.
Variant 1 | Variant 2
ॳ (U+0973) | अं (U+0905 U+0902)
ऺ (U+093A) | ं (U+0902)
ॴ (U+0974) | आं (U+0906 U+0902)
ऻ (U+093B) | ां (U+093E U+0902)
ऎ (U+090E) | ऐ (U+0910)
ॆ (U+0946) | े (U+0947)
ॵ (U+0975) | औ (U+0914)
ॏ (U+094F) | ौ (U+094C)
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point contexts, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence or absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases, the presence or absence of the final Halant is clearly visible, and there is no apparent case for making them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1 | Variant 2
ँ (U+0901) | ॅं (U+0945 U+0902)
ाँ (U+093E U+0901) | ॉं (U+0949 U+0902)
अँ (U+0905 U+0901) | ॲं (U+0972 U+0902)
एँ (U+090F U+0901) | ऍं (U+090D U+0902)
आँ (U+0906 U+0901) | ऑं (U+0911 U+0902)
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, like a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, U+0945 U+0902 and U+0949 U+0902 should each be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, a Vowel, a Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a consonant, or by a consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant, or by a Consonant followed by a Nukta.
There is no additional constraint on the character preceding ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
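The context condition in the rule above can be sketched in Python as follows. The consonant set is transcribed from Table 6. The function name is hypothetical, and the sketch covers only the single mapping U+0901 -> U+0945 U+0902, not the other pairs of Table 18.

```python
CANDRABINDU = "\u0901"
ANUSVARA = "\u0902"
NUKTA = "\u093C"
CANDRA_E = "\u0945"

# Consonants of Table 6: U+0915..U+0939 without the excluded NNNA, RRA and LLLA,
# plus the four Sindhi consonants.
CONSONANTS = (
    {chr(cp) for cp in range(0x0915, 0x093A)}
    - {"\u0929", "\u0931", "\u0934"}
) | {"\u097B", "\u097C", "\u097E", "\u097F"}


def candrabindu_to_candra_anusvara(label):
    """Replace each U+0901 by U+0945 U+0902 where the variant context rule allows it."""
    out = []
    for i, ch in enumerate(label):
        prev = label[i - 1] if i >= 1 else ""
        prev2 = label[i - 2] if i >= 2 else ""
        # Context: preceded by a Consonant, or by a Consonant followed by a Nukta.
        allowed = prev in CONSONANTS or (prev == NUKTA and prev2 in CONSONANTS)
        if ch == CANDRABINDU and allowed:
            out.append(CANDRA_E + ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu that follows a Vowel or a Matra is left untouched by this mapping, because a Matra such as U+0945 could not validly appear in that position.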
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be treated as blocked variants.
There is no preference among these variants. Whichever label containing either of these variants is chosen first, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word ‘most’ is used here as it is not practical to cover all the possible “Consonant + Halant + Consonant + …” cases. However, for Devanagari, all cases of “Consonant + Halant + Consonant” combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are deemed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं (U+0902) | ਂ (U+0A02)
इ (U+0907) | ਙ (U+0A19)
उ (U+0909) | ਤ (U+0A24)
ग (U+0917) | ਗ (U+0A17)
घ (U+0918) | ਬ (U+0A2C)
ट (U+091F) | ਟ (U+0A1F)
ठ (U+0920) | ਠ (U+0A20)
ढ (U+0922) | ਫ (U+0A2B)
प (U+092A) | ਧ (U+0A27)
भ (U+092D) | ਮ (U+0A2E)
म (U+092E) | ਸ (U+0A38)
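व (U+0935) | ਕ (U+0A15)
ह (U+0939) | ਵ (U+0A35)
ऺ (U+093A) | ਂ (U+0A02)
़ (U+093C) | ਼ (U+0A3C)
ि (U+093F) | ਿ (U+0A3F)
ी (U+0940) | ੀ (U+0A40)
ॅ (U+0945) | ੱ (U+0A71)
ॆ (U+0946) | ੇ (U+0A47)
ॆ (U+0946) | ੋ (U+0A4B)
े (U+0947) | ੇ (U+0A47)
े (U+0947) | ੋ (U+0A4B)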
ै (U+0948) | ੈ (U+0A48)
ॖ (U+0956) | ੁ (U+0A41)
ॗ (U+0957) | ੂ (U+0A42)
प्टि (U+092A U+094D U+091F U+093F) | ਇ (U+0A07)
प्टी (U+092A U+094D U+091F U+0940) | ਈ (U+0A08)
प्टे (U+092A U+094D U+091F U+0947) | ਏ (U+0A0F)
प्टॆ (U+092A U+094D U+091F U+0946) | ਏ (U+0A0F)
त्त (U+0924 U+094D U+0924) | ਜ (U+0A1C)
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म (U+092E) | ম (U+09AE)
ि (U+093F) | ি (U+09BF)
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLE rules that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA) and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
The set V1 consists of these vowels:
a आ (U+0906) (Required in the Santali language)
b ओ (U+0913) (Required in the Santali language)
The set M1 consists of these matras:
a ा (U+093E) (Required in the Santali language)
b ो (U+094B) (Required in the Santali language)
2. H must be preceded by C or CN (see footnote 15).
3. M must be preceded by C or CN (see footnote 16).
4. X must be preceded by one of V, C, N or M.
5. B must be preceded by one of V, C, N or M.
6. D must be preceded by one of V, C, N or M.
7. V cannot be preceded by H (details in “Case of V preceded by H”).
15 where CN is a C followed by an N
16 where CN is a C followed by an N
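Rules 1 to 7 can be read as "preceded-by" checks on each code point of a label. The Python sketch below is one way to express them and is not the normative XML of the LGR. It assumes the category assignment of Table 6, treats each Eyelash Reph sequence of Table 7 as a single consonant-like unit, and ignores variants and all other Root Zone constraints. All names (`tokenize`, `check_label`, `NUKTA_BASES`) are hypothetical.

```python
NUKTA, HALANT = "\u093C", "\u094D"
# Eyelash Reph sequences from Table 7; each is treated as one consonant-like unit.
EYELASH = {"\u0931\u094D\u092F", "\u0931\u094D\u0939"}


def _chars(*ranges):
    """Expand inclusive (low, high) code point ranges into a set of characters."""
    return {chr(cp) for low, high in ranges for cp in range(low, high + 1)}


# Categories transcribed from Table 6.
CATEGORY = {}
CATEGORY.update(dict.fromkeys(
    _chars((0x0905, 0x090B), (0x090D, 0x0914), (0x0972, 0x0977)), "V"))
CATEGORY.update(dict.fromkeys(
    _chars((0x0915, 0x0928), (0x092A, 0x0930), (0x0932, 0x0933),
           (0x0935, 0x0939), (0x097B, 0x097C), (0x097E, 0x097F)), "C"))
CATEGORY.update(dict.fromkeys(
    _chars((0x093A, 0x093B), (0x093E, 0x0943), (0x0945, 0x094C),
           (0x094F, 0x094F), (0x0956, 0x0957)), "M"))
CATEGORY.update({"\u0902": "B", "\u0901": "D", "\u0903": "X",
                 HALANT: "H", NUKTA: "N"})

# Members of C1, V1 and M1 from rule 1.
NUKTA_BASES = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B"
                  "\u0906\u0913" "\u093E\u094B")


def tokenize(label):
    """Split a label into (category, text) tokens; unknown code points get '?'."""
    tokens, i = [], 0
    while i < len(label):
        if label[i:i + 3] in EYELASH:
            tokens.append(("C", label[i:i + 3]))
            i += 3
        else:
            tokens.append((CATEGORY.get(label[i], "?"), label[i]))
            i += 1
    return tokens


def check_label(label):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    tokens = tokenize(label)
    for idx, (cat, text) in enumerate(tokens):
        prev_cat, prev_text = tokens[idx - 1] if idx else (None, None)
        prev2_cat = tokens[idx - 2][0] if idx >= 2 else None
        after_c_or_cn = prev_cat == "C" or (prev_cat == "N" and prev2_cat == "C")
        if cat == "?":
            problems.append(f"position {idx}: code point not in repertoire")
        elif cat == "N" and prev_text not in NUKTA_BASES:
            problems.append(f"position {idx}: rule 1 (N after C1, V1 or M1 only)")
        elif cat == "H" and not after_c_or_cn:
            problems.append(f"position {idx}: rule 2 (H after C or CN)")
        elif cat == "M" and not after_c_or_cn:
            problems.append(f"position {idx}: rule 3 (M after C or CN)")
        elif cat in ("X", "B", "D") and prev_cat not in ("V", "C", "N", "M"):
            problems.append(f"position {idx}: rules 4-6 ({cat} after V, C, N or M)")
        elif cat == "V" and prev_cat == "H":
            problems.append(f"position {idx}: rule 7 (V not after H)")
    return problems
```

Because every dependent sign must be preceded by something, a label that starts with N, H, M, X, B or D is reported as a violation, so in this sketch a valid label always starts with a Vowel or a Consonant.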
Additional rules are used only for variants where a Nukta maps to a null, or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or the end of the label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules, there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added only as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of a context rule is required.
Case of V preceded by H
As any valid akshar in Devanagari begins with either a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case “V preceded by H” needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार (aːməchaːr, “Mango pickle”): U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930.
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
47
If required in future depending on the prevailing requirements by the community the
NBGPmayconsiderrevisitingthisrule
8 Contributors

NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D Kulkarni and Dr. Ajay Data

Following is the full list of NBGP members with their language expertise:

Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | DataXgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | DataXgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | Basanta Kumar Panda | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | Debajit Sharma | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia Foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkz.com | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and .np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages/scripts:

Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K P Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References

[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4 Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate, Ministry of HRD, Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
11 Appendix A: Visually confusable characters/sequences

Table 21 below shows characters and character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C

Table 21: Visually confusables
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.

At first they may not look exactly the same; however, in a given context (e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing), they can create visual confusion.

A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
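The "all constituents must be confusable" condition can be illustrated with a short sketch. It assumes a lookup table holding only the Devanagari and Gurmukhi pairs from Table 22, and it simplifies by treating each code point as one unit, whereas the text speaks of characters/aksharas. The names `GURMUKHI_CONFUSABLES` and `is_cross_script_confusable` are hypothetical.

```python
# Devanagari code point -> Gurmukhi look-alikes, transcribed from Table 22.
GURMUKHI_CONFUSABLES = {
    "\u0920": {"\u0A28", "\u0A30"},  # TTHA vs Gurmukhi NA, RA
    "\u0921": {"\u0A21", "\u0A24"},  # DDA vs Gurmukhi DDA, TA
    "\u0922": {"\u0A22"},            # DDHA vs Gurmukhi DDHA
    "\u0924": {"\u0A1C"},            # TA vs Gurmukhi JA
    "\u092F": {"\u0A27"},            # YA vs Gurmukhi DHA
}


def is_cross_script_confusable(label, table=GURMUKHI_CONFUSABLES):
    """True only if every code point of label has a look-alike in the table."""
    return len(label) > 0 and all(ch in table for ch in label)


print(is_cross_script_confusable("\u0924\u092F"))  # both letters are in the table
print(is_cross_script_confusable("\u0924\u0915"))  # KA has no listed look-alike
```

A single code point without a counterpart is enough to make the whole label distinguishable, which is why the check uses `all` and not `any`.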
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 (see footnote 17) | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali

Table 22: Devanagari Cross-script confusables

17. The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
4.1.2.1 External Limits on Scope

The code point repertoire for the root zone being a very special case up the ladder in the protocol hierarchies, the canvas of available characters for selection as a part of the Root Zone code point repertoire is already constrained by various protocol layers beneath it. The following three main protocols/standards act as successive filters:

i. The Unicode Standard

Out of all the characters that are needed by the given script, if the character in question is not encoded in Unicode, it cannot be incorporated in the code point repertoire. Such cases are quite rare, given the elaborate and exhaustive character inclusion efforts made by the Unicode Consortium.

ii. IDNA Protocol

Unicode, being the character-encoding standard for providing the maximum possible representation of a given script/language, has encoded as far as possible all the characters needed by the script. However, the domain name being a specialized case, it is governed by an additional protocol known as IDNA (Internationalized Domain Names in Applications). The IDNA protocol excludes some characters of the Unicode repertoire from being part of domain names.

For example, Devanagari Letter Qa क़ (U+0958) is not allowed to be a part of a domain name. Its decomposed form, i.e. Devanagari Letter Ka followed by Devanagari Sign Nukta, क (U+0915) + (U+093C), can be used instead.
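The relationship between U+0958 and its decomposed form can be checked with Python's standard `unicodedata` module. This is a small illustration only; as background not stated in this document, IDNA2008 expects labels to be in Normalization Form C, and U+0958 does not survive that normalization unchanged.

```python
import unicodedata

qa = "\u0958"  # DEVANAGARI LETTER QA, the precomposed form

# NFC normalization replaces the precomposed letter with KA + NUKTA,
# because U+0958 is excluded from canonical composition.
normalized = unicodedata.normalize("NFC", qa)

print([f"U+{ord(ch):04X}" for ch in normalized])
print([unicodedata.name(ch) for ch in normalized])
```

The first line of output lists U+0915 and U+093C, which is the sequence the repertoire admits (subject to the Nukta context rule).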
IDNA also imposes restrictions on the invisible characters Zero Width Non-Joiner (U+200C) and Zero Width Joiner (U+200D) in the form of CONTEXTJ rules. These are required in certain cases where a typical visual shape of an akshar is desired.

In domain names, due to the absence of space " " or tab "-", there will be cases where the inability to use ZWNJ can pose some issues: two words need to be joined together, where the previous word needs to end in an Explicit Halant and the next word begins with a consonant. In that case a conjunct will be formed between the last consonant of the first word and the first consonant of the second word. This visual display may not be desired. For example, if two words देश् (deš, nation) and विदेश (videš, foreign land) are juxtaposed to each other, the resultant word, i.e. "देश्विदेश" (see footnote 10), is not the appropriate way of rendering it. Appropriate rendering of the same would be "देश् विदेश", which can be achieved by adding a ZWNJ in between the two words.

As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire if necessary.

However, there may not be much of an impact of the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which ZWJ was earlier used. Some specific shapes (see footnote 11) may not be able to be made; however, there will not be any impact on the phonetic level.

iii. Maximal Starting Repertoire

The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].

To sum up, the restrictions start off with admitting only such characters as are part of the code block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
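The three successive filters can be pictured as nested sets. The sketch below uses toy sets containing only the three code points used as examples in this section; the real Unicode, IDNA and MSR tables are far larger, and the names here are hypothetical.

```python
# Toy stand-ins for the three layers, limited to this section's examples.
UNICODE_DEVANAGARI = {0x0915, 0x093D, 0x0958}  # encoded in the script's block
IDNA_PVALID = {0x0915, 0x093D}                 # U+0958 is dropped by IDNA
MSR = {0x0915}                                 # U+093D is dropped by the MSR

FILTERS = [
    ("Unicode", UNICODE_DEVANAGARI),
    ("IDNA", IDNA_PVALID),
    ("MSR", MSR),
]


def first_rejecting_filter(code_point):
    """Return the name of the first layer that excludes code_point, or None."""
    for name, allowed in FILTERS:
        if code_point not in allowed:
            return name
    return None


for cp in (0x0915, 0x0958, 0x093D):
    print(f"U+{cp:04X}", first_rejecting_filter(cp))
```

Only code points that pass all three layers are candidates for the script repertoire, which the Generation Panel may then narrow further.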
10. In this particular case, though it is possible to get the required display by dropping the explicit Halant at the end of the word, one can argue that the pronunciation of the two words, i.e. देश् and देश, is different and hence it changes the fundamental word.
11. Case of the two conjunct shapes of क and ष: the first is composed with क + Halant + ष, while the latter is with क + Halant + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks

The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.

4.1.2.3 No Symbols and Abbreviations

Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970), etc. will not be included.

4.1.2.4 No Rare and Obsolete Characters

There are characters which have been added to Unicode to accommodate rare forms, especially like DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms (U+0944) and (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.

4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic

Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.

4.2 Methodology to incorporate the feedback received through Public Comment process

The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were of an editorial nature. Some comments demanded attention to the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided for each individual comment whether it needed a change in the overall LGR recommendation.

Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change is not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire

Section 5.1 provides the section of the [MSR] applicable to the Devanagari script on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.

5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4

Figure 2: Devanagari Code Page from [MSR]

Color convention (see footnote 12):
All characters that are included in the [MSR]: Yellow background
PVALID in IDNA 2008 but excluded from the MSR for various reasons: Pinkish background
Not PVALID in IDNA 2008: White background

12. This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire

For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references for Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, together they cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 | 0901 | ँ | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1 (Hindi, Nepali) | [0], [101], [102], [103], [105], [108], [110], [111], [112], [113]
2 | 0902 | ं | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [113]
3 | 0903 | ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [113]
4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [113]
5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [113]
6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [113]
7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [113]
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 (Hindi) | [0], [101], [102], [103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 (Hindi) | [0], [101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 (Kashmiri) | [0], [105], [108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 (Hindi) | [0], [100], [101], [102], [108], [112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 (Kashmiri) | [0], [105], [108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 (Nepali) | [0], [102], [103], [110], [112], [113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [104], [105], [108], [113]
53 | 093A | ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 (Kashmiri) | [11], [105], [108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 (Kashmiri) | [11], [105], [108]
55 | 093C | ़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 (Hindi) | [0], [101], [105], [108], [110], [109], [111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [113]
59 | 0941 | ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [113]
60 | 0942 | ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [113]
61 | 0943 | ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 (Hindi) | [0], [101], [102], [103]
62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E (= candra) | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 (Hindi) | [0], [100], [101], [108]
63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 (Kashmiri) | [0], [105], [108]
64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [105], [108], [113]
65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 (Hindi) | [0], [100], [108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 (Kashmiri) | [0], [105], [108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [105], [108], [113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [105], [108], [113]
70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant/Virama | Most of the languages given in Section 3.2 | 1 (Hindi, Nepali) | [0], [101], [102], [103], [105], [108], [113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 (Kashmiri) | [0], [105], [108]
72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 (Kashmiri) | [11], [105], [108]
73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 (Kashmiri) | [11], [105], [108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 (Konkani, Marathi) | [9], [100], [102], [108], [112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 (Kashmiri) | [11], [105], [108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 (Kashmiri) | [11], [105], [108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 (Kashmiri) | [11], [105], [108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 (Kashmiri) | [11], [105], [108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 (Kashmiri) | [11], [105], [108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 (Sindhi) | [8], [104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 (Sindhi) | [8], [104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 (Sindhi) | [8], [104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 (Sindhi) | [8], [104]

Table 6: Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph" construct (see footnote 13).

Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106], [107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106], [107]

Table 7: Sequences

13. Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by normal Ra, U+0930), the term "Reph" is used here.
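Because U+0931 enters the repertoire only through the two sequences of Table 7, a label containing it is acceptable only where one of those sequences starts at that position. A minimal sketch of that check follows; the function name is hypothetical and the rest of the label is not validated here.

```python
RRA = "\u0931"
# The two permitted sequences from Table 7: RRA + VIRAMA + YA / HA.
EYELASH_REPH_SEQUENCES = ("\u0931\u094D\u092F", "\u0931\u094D\u0939")


def rra_only_in_sequences(label):
    """True if every U+0931 in label starts one of the permitted sequences."""
    i = 0
    while i < len(label):
        if label[i] == RRA:
            if not any(label.startswith(seq, i) for seq in EYELASH_REPH_SEQUENCES):
                return False
            i += 3  # skip past the whole three-code-point sequence
        else:
            i += 1
    return True


print(rra_only_in_sequences("\u0926\u0931\u094D\u092F\u093E"))  # RRA inside a sequence
print(rra_only_in_sequences("\u0931\u093E"))                    # bare RRA plus a matra
```

Treating the sequence as a single repertoire element is what lets the WLE rules omit any dedicated Eyelash Reph rule: once matched, the sequence ends in a consonant and behaves like one.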
5.3 Code points not included

The following code points have not been included in the repertoire:

Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of forming their words, known as akshar. The next section gives detailed akshar formation rules as applicable to the representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g., the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g., the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second section lists four operators, along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
5.5.2 Operators used
Symbol | Function
| | Alternative
[ ] | Optional
* | Variable repetition
( ) | Sequence group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari, when used to write Hindi, are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), a Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence description | Sequence | Example | Constituting characters
Vowel | V | अ (a) | U+0905
Vowel + Anusvara | V[B] | अं (aṁ) | U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ (aṃ) | U+0905 U+0901
Vowel + Visarga | V[X] | अः (aḥ) | U+0905 U+0903
Table 9
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the consonant sequence after the N, M and H. Each of these is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence description | Sequence | Example | Constituting characters
Consonant | C | क (ka) | U+0915 <single character>
Consonant + Nukta | C[N] | क़ (ḳa) | U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], Visarga [X] or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि (ki) | U+0915 U+093F
Consonant + Anusvara | C[B] | कं (kaṁ) | U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ (kaṃ) | U+0915 U+0901
Consonant + Visarga | C[X] | कः (kaḥ) | U+0915 U+0903
Consonant + Halant | C[H] | क् (k, pure consonant) | U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Examples:
Sequence description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं (kīṁ) | U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ (kāṃ) | U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः (kīḥ) | U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14):
*3(CH)C
Example:
Sequence description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य (nkrya) | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets:
3A. The combination may be followed by M, B, D or X.
Examples:
14 In the case of Sanskrit, it can join up to 5 consonants.
Sequence description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की (kkī) | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं (kkaṁ) | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ (kkaṃ) | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः (kkaḥ) | U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X.
Examples:
Sequence description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं (kkīṁ) | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ (kkīṃ) | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः (kkīḥ) | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
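The vowel and consonant sequences above can be summarised as a regular expression over the category variables. The following Python sketch is illustrative only. It assumes a deliberately simplified category mapping based on contiguous Unicode ranges (which also covers some code points excluded from the repertoire and omits the Kashmiri additions), and the names `category` and `is_akshar` are hypothetical.

```python
import re


def category(ch: str) -> str:
    """Map a code point to its variable (simplified, range-based assumption)."""
    cp = ord(ch)
    single = {0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H", 0x093C: "N"}
    if cp in single:
        return single[cp]
    if 0x0905 <= cp <= 0x0914:
        return "V"
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x093E <= cp <= 0x094C:
        return "M"
    return "?"


# Vowel sequence: V[B|D|X]
VOWEL_SEQ = r"V[BDX]?"
# Consonant sequences: C[N][H], or up to four consonants (each optionally
# carrying a Nukta) joined by Halant, optionally followed by M and B/D/X.
CONSONANT_SEQ = r"CN?H|(?:CN?H){0,3}CN?M?[BDX]?"
AKSHAR = re.compile(VOWEL_SEQ + "|" + CONSONANT_SEQ)


def is_akshar(text: str) -> bool:
    """Return True if the text is exactly one akshar under the Hindi rules."""
    cats = "".join(category(ch) for ch in text)
    return AKSHAR.fullmatch(cats) is not None
```

The limit of four consonants follows the Hindi rule 3 above; as noted in the text, the WLE rules of Section 7 do not impose that limit.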
6 Variants
There are no characters or character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity.
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community.
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to a possibility of creation of certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) have a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. E.g., in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e., if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 set character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric (reverse) variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an “effective null variant”, where the U+093C in one sequence maps to “nothing” in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
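The effect of the “not followed by a Nukta” condition can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the behaviour of any particular LGR tool: the function name `nukta_variants` is hypothetical, and the sketch enumerates only the Table 16 mappings, applying the same context condition in both directions.

```python
import itertools

NUKTA = "\u093C"
# Variant 1 members of Table 16: AA, O, vowel sign AA, vowel sign O.
BASES = ("\u0906", "\u0913", "\u093E", "\u094B")


def nukta_variants(label: str) -> set:
    """Return the label together with all its Table 16 variant labels."""
    options = []
    i = 0
    while i < len(label):
        ch = label[i]
        if ch in BASES:
            has_nukta = label[i + 1:i + 2] == NUKTA
            end = i + 2 if has_nukta else i + 1
            source = label[i:end]
            # Context rule: the source (with or without its Nukta) must
            # not be followed by a further Nukta for the mapping to exist.
            if label[end:end + 1] != NUKTA:
                options.append((ch, ch + NUKTA))
            else:
                options.append((source,))
            i = end
        else:
            options.append((ch,))
            i += 1
    return {"".join(choice) for choice in itertools.product(*options)}
```

With this condition, U+0906 and U+0906 U+093C are variants of each other, while the (already invalid) sequence U+0906 U+093C U+093C produces no variant other than itself.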
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the sequences of Variant Sets A, B, C and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence or absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
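The reason for extending the sets can be checked mechanically. The sketch below, in Python, treats each final variant set as a list of code point sequences, derives all pairwise mappings, and tests them for symmetry and transitivity. The helper names `mappings` and `is_symmetric_and_transitive` are hypothetical, and the context rules are deliberately left out of this illustration.

```python
from itertools import permutations

# Final variant sets of Section 6.1.2, as space-separated code points.
VARIANT_SETS = {
    "A": ["093E 0901", "093E 093C 0901", "0949 0902"],
    "B": ["0906 0901", "0906 093C 0901", "0911 0902"],
    "C": ["0906 0902", "0906 093C 0902", "0974"],
    "D": ["093B", "093E 0902", "093E 093C 0902"],
}


def mappings(members):
    """All ordered (source, target) pairs within one variant set."""
    return set(permutations(members, 2))


def is_symmetric_and_transitive(pairs):
    """Check the two closure properties on a set of (source, target) pairs."""
    symmetric = all((t, s) in pairs for s, t in pairs)
    transitive = all(
        (a, d) in pairs
        for a, b in pairs
        for c, d in pairs
        if b == c and a != d
    )
    return symmetric and transitive
```

If Set A kept only its original mapping plus the Nukta mapping for 093E, the pair between 093E 093C 0901 and 0949 0902 would be missing and the transitivity check would fail, which is what the added sequences avoid.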
Additional Notes
1 Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule “when(follows-only-C-or-N)” (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2 Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
U+093A | U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
U+0946 | U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence or absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1 | Variant 2
U+0901 | U+0945 U+0902
U+093E U+0901 | U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, like a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क, U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should each be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e., U+0945. While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for U+0945, which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
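The context condition on this pair can be sketched in Python. The sketch assumes a simplified consonant test (the contiguous range U+0915 to U+0939) and shows only the forward mapping from Candrabindu to Candra E plus Anusvara; the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
CANDRA_ANUSVARA = "\u0945\u0902"  # vowel sign Candra E + Anusvara
NUKTA = "\u093C"


def is_consonant(ch: str) -> bool:
    """Simplified assumption: treat U+0915..U+0939 as consonants."""
    return "\u0915" <= ch <= "\u0939"


def follows_c_or_cn(label: str, i: int) -> bool:
    """True if position i is preceded by C, or by C followed by a Nukta."""
    if i >= 1 and is_consonant(label[i - 1]):
        return True
    return i >= 2 and label[i - 1] == NUKTA and is_consonant(label[i - 2])


def to_candra_anusvara(label: str) -> str:
    """Replace each Candrabindu by U+0945 U+0902 where the rule allows it."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and follows_c_or_cn(label, i):
            out.append(CANDRA_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu after a Vowel or a Matra is left untouched, because a Matra such as U+0945 could not validly appear in that position.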
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible “Consonant + Halant + Consonant + …” cases. However, for Devanagari all cases of “Consonant + Halant + Consonant” combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are the variants that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
U+0902 | U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
U+093A | U+0A02
U+093C | U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
U+0945 | U+0A71
U+0946 | U+0A47
U+0946 | U+0A4B
U+0947 | U+0A47
U+0947 | U+0A4B
U+0948 | U+0A48
U+0956 | U+0A41
U+0957 | U+0A42
U+092A U+094D U+091F U+093F | ਇ U+0A07
U+092A U+094D U+091F U+0940 | ਈ U+0A08
U+092A U+094D U+091F U+0947 | ਏ U+0A0F
U+092A U+094D U+091F U+0946 | ਏ U+0A0F
U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
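A whole-label cross-script variant exists only when every element of a label has a counterpart in the other script. The Python sketch below illustrates this with a small, illustrative subset of the single code point rows of Table 19; the names `DEVA_TO_GURU` and `gurmukhi_lookalike` are hypothetical, and the multi-code-point rows and the one-to-many rows of the table are not handled.

```python
# Illustrative subset of Table 19 (Devanagari -> Gurmukhi), single code points.
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # GA
    "\u091F": "\u0A1F",  # TTA
    "\u0920": "\u0A20",  # TTHA
    "\u092E": "\u0A38",  # Devanagari MA resembles Gurmukhi SA
    "\u093F": "\u0A3F",  # vowel sign I
    "\u0940": "\u0A40",  # vowel sign II
}


def gurmukhi_lookalike(label: str):
    """Return the Gurmukhi look-alike label, or None if there is none."""
    if label and all(ch in DEVA_TO_GURU for ch in label):
        return "".join(DEVA_TO_GURU[ch] for ch in label)
    return None
```

A label containing even one code point without a counterpart, such as U+0915, has no whole-label cross-script variant in this sketch.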
7 Whole Label Evaluation Rules (WLE)
This section provides the WLE rules that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ, DEVANAGARI LETTER RRA), H is 094D (DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य, DEVANAGARI LETTER YA) or 0939 (ह, DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1 N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
The set V1 consists of these vowels:
a आ (U+0906) (Required in Santali language)
b ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a ा (U+093E) (Required in Santali language)
b ो (U+094B) (Required in Santali language)
2 H must be preceded by C or CN (see footnote 15)
3 M must be preceded by C or CN (see footnote 16)
4 X must be preceded by either of V, C, N or M
5 B must be preceded by either of V, C, N or M
6 D must be preceded by either of V, C, N or M
7 V can NOT be preceded by H (details in “Case of V preceded by H”)
15 where CN is a C followed by an N
16 where CN is a C followed by an N
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
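The seven WLE rules can be illustrated with a small Python checker. This is a minimal sketch under stated assumptions: the category mapping is a simplified, range-based approximation of Table 6 (it omits the Kashmiri matras and the sequence handling for U+0931), and the names `category` and `wle_violation` are hypothetical. It reports the number of the first violated rule, whereas an actual LGR tool may flag a label under a different rule.

```python
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = set("\u0906\u0913")
M1 = set("\u093E\u094B")


def category(ch: str) -> str:
    """Simplified, range-based category assignment (an assumption)."""
    cp = ord(ch)
    single = {0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H", 0x093C: "N"}
    if cp in single:
        return single[cp]
    if 0x0905 <= cp <= 0x0914:
        return "V"
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x093E <= cp <= 0x094C:
        return "M"
    return "?"


def wle_violation(label: str):
    """Return the number of the first violated WLE rule, or None."""
    for i, ch in enumerate(label):
        cat = category(ch)
        prev = label[i - 1] if i > 0 else ""
        pcat = category(prev) if prev else ""
        # CN: the previous code point is a Nukta that itself follows a C.
        after_cn = pcat == "N" and i >= 2 and category(label[i - 2]) == "C"
        if cat == "N" and prev not in C1 | V1 | M1:
            return 1
        if cat == "H" and not (pcat == "C" or after_cn):
            return 2
        if cat == "M" and not (pcat == "C" or after_cn):
            return 3
        if cat in {"X", "B", "D"} and pcat not in {"V", "C", "N", "M"}:
            return {"X": 4, "B": 5, "D": 6}[cat]
        if cat == "V" and pcat == "H":
            return 7
    return None
```

For instance, the multi-word example discussed under “Case of V preceded by H” (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930) is reported under rule 7 by this sketch.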
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1 As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it gets permitted only within those specific sequences.
2 The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any of the valid akshar-ending characters. It is to be noted that only the case “V preceded by H” needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g., आम्अचार /aːməchaːr/ “Mango pickle” (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of the hyphen, space or Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DN Domains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi / Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K P Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, “Maximal Starting Repertoire - MSR-4: Overview and Rationale”, 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, “Representing Label Generation Rulesets Using XML”, RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, “Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels”, RFC 8228, August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11, http://www.unicode.org/versions/Unicode11.0.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, “Variant Issues Report”, ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, “South and Central Asia-I - Official Scripts of India”, Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate, Ministry of HRD, Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1 Dillinger D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchison, London, 1968.
10.2 DEVANĀGARĪ
1 Agrawala V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2 Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3 Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4 Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5 Bright W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds), The World’s Writing Systems (Pp. 384-390). New York: Oxford University Press.
6 Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed). London: Croom Helm. 448-469.
7 Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8 Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9 Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10 Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11 Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12 Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13 McGregor R. S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14 McGregor R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15 McGregor R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16 McGregor R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17 Pandey P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy 10(2), 139-156, 2007.
18 Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19 Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20 Singh A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21 Sproat R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22 Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23 Verma M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1 IS 10401: 8-bit code for information interchange, 1982
2 IS 10315: 7-bit coded character set for information interchange, 1985
3 IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987
4 ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001
5 ISO 2375: Procedure for registration of escape sequences, 2003
6 ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001
7 IDN POLICY, httpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in the given context, e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 (footnote 17) | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
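The all-or-nothing condition stated above (a label is a cross-script confusable only if every constituent has a confusable in the same other script) can be sketched as follows. This is a minimal illustration, not part of the LGR: it covers only the Gurmukhi and Bengali rows of Table 22, works on single code points rather than full aksharas, and the names `CONFUSABLES` and `fully_confusable_scripts` are hypothetical.

```python
# Subset of Table 22 (Gurmukhi and Bengali rows only), grouped by script.
# Keys are Devanagari code points; values are the listed confusable code points.
CONFUSABLES = {
    "Gurmukhi": {
        0x0920: (0x0A28, 0x0A30),
        0x0921: (0x0A21, 0x0A24),
        0x0922: (0x0A22,),
        0x0924: (0x0A1C,),
        0x092F: (0x0A27,),
    },
    "Bengali": {
        0x0903: (0x0983,),
        0x0909: (0x0993,),
        0x0918: (0x0998,),
        0x0945: (0x0981,),
    },
}


def fully_confusable_scripts(label):
    """Return the scripts in which every code point of `label` has a listed confusable."""
    if not label:
        return []
    return [
        script
        for script, table in CONFUSABLES.items()
        if all(ord(ch) in table for ch in label)
    ]
```

A label such as U+0920 U+0915 returns an empty list, because U+0915 has no listed confusable and therefore provides the visual distinction described above.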
and the first consonant of the second word. This visual display may not be desired. For example, if two words देश् (deš, nation) and विदेश (videš, foreign land) are juxtaposed to each other, the resultant word, i.e. “देश्विदेश” (footnote 10), is not the appropriate way of rendering it. Appropriate rendering of the same would be “देश्‌विदेश”, which can be achieved by adding a ZWNJ in between the two words.
As the ZWNJ is not part of the MSR, it is not permissible to make such combinations. If and when the ZWNJ is permitted by the MSR, the then NBGP may consider adding it to the Devanagari repertoire, if necessary.
However, there may not be much of an impact of the exclusion of the ZWJ from the MSR, as there are better alternatives already available for depicting the cases for which the ZWJ was earlier used. Some specific shapes (footnote 11) may not be able to be made; however, there will not be any impact on the phonetic level.
iii Maximal Starting Repertoire
The root zone LGR being a repertoire of the characters which are going to be used for creation of the root zone TLDs, which in turn are an even more specialized case of domain names, the Root Zone LGR Procedure introduces additional exclusions on the IDNA-allowed set of characters. For example, the Devanagari Sign Avagraha ऽ (U+093D), even if allowed by the IDNA protocol, is not permitted in the root zone repertoire as per the [MSR].
To sum up, the restrictions start off with admitting only such characters as are part of the code-block of the given script/language. This is further narrowed down by the IDNA Protocol, and finally an additional filter in the form of the Maximal Starting Repertoire restricts the character set associated with the given language even more.
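The three successive filters summarized above can be pictured as a short chain of membership tests. The sketch below is only illustrative: `TOY_PVALID` and `TOY_MSR` are tiny hypothetical stand-ins for the real IDNA2008 PVALID set and the [MSR], which are far larger, and the function name is an assumption.

```python
# Toy stand-ins (assumptions): the real PVALID and MSR sets are much larger.
TOY_PVALID = {0x0915, 0x093D}   # KA and AVAGRAHA, both allowed by IDNA
TOY_MSR = {0x0915}              # AVAGRAHA is excluded by the MSR


def first_failed_filter(cp, pvalid=TOY_PVALID, msr=TOY_MSR):
    """Return the name of the first filter that rejects `cp`, or None if it passes all."""
    if not 0x0900 <= cp <= 0x097F:   # Devanagari code block
        return "script block"
    if cp not in pvalid:             # narrowed by the IDNA protocol
        return "IDNA"
    if cp not in msr:                # narrowed again by the MSR
        return "MSR"
    return None
```

With these toy sets, U+093D passes the first two filters and is rejected only at the MSR stage, mirroring the Avagraha example in the text.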
10 In this particular case, though it is possible to get the required display by dropping the explicit Halant at the end of the word, one can argue that the pronunciation of the two words, i.e. देश् and देश, is different and hence it changes the fundamental word.
11 Case of क्ष and क्‍ष: the first is composed with क + ् + ष while the latter is with क + ् + ZWJ + ष. The pronunciation of both the conjuncts is the same.
4.1.2.2 No Punctuation Marks
The TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.
4.1.2.3 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters like Isshar (U+09FA), Abbreviation sign (U+0970), etc. will not be included.
4.1.2.4 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, especially like DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms ॄ (U+0944) and ॣ (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.
4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
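As a small illustration of the exclusion categories above, the code points that the text names explicitly can be collected into a lookup that reports why a character in a candidate label would be rejected. This is a sketch under the assumption that only the named examples are listed; the actual exclusions are broader, and `EXCLUDED` and `excluded_code_points` are hypothetical names.

```python
# Only the code points named in the text; the real exclusion lists are longer.
EXCLUDED = {
    0x0964: "punctuation mark",
    0x0965: "punctuation mark",
    0x09FA: "symbol or abbreviation",
    0x0970: "symbol or abbreviation",
    0x0960: "rare or obsolete character",
    0x0961: "rare or obsolete character",
    0x0944: "rare or obsolete character",
    0x0963: "rare or obsolete character",
    0x0951: "stress marker",
    0x0952: "stress marker",
}


def excluded_code_points(label):
    """List (code point, reason) pairs for every excluded character in `label`."""
    return [
        (f"U+{ord(ch):04X}", EXCLUDED[ord(ch)])
        for ch in label
        if ord(ch) in EXCLUDED
    ]
```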
4.2 Methodology to incorporate the feedback received through Public Comment process
The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were editorial in nature. Some comments called for attention to the normative section of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided, for each individual comment received, whether it needed a change in the overall LGR recommendation.
Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed by a detailed explanation of why the said change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. On overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2: Devanagari Code Page from [MSR]
Color convention (footnote 12):
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the MSR for various reasons - Pinkish background
Not PVALID in IDNA2008 - White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references of Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, they together cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (Not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 0901 DEVANAGARISIGNCANDRABINDU Candrabindu
BodoHindiKashmiriKonkaniMaithiliMarathiNepaliSantaliand
Sanskrit
1HindiNepali
[0][101][102][103][105][108][110][111][112]
[113]
2 0902 DEVANAGARISIGNANUSVARA
Anusvara(Bindu)
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
3 0903 ः DEVANAGARISIGNVISARGA Visarga
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
4 0905 अ DEVANAGARILETTERA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
5 0906 आ DEVANAGARILETTERAA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
6 0907 इ DEVANAGARILETTERI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
7 0908 ई DEVANAGARILETTERII Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
8 0909 उ DEVANAGARILETTERU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
9 090A ऊ DEVANAGARILETTERUU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
10 090B ऋ DEVANAGARI
LETTERVOCALICR
Vowel HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
11 090D ऍ DEVANAGARILETTERCANDRAE Vowel Hindi 1Hindi [0][101]
12 090E ऎ DEVANAGARILETTERSHORTE Vowel Kashmiri 4Kashmiri [0][105][108]
13 090F ए DEVANAGARILETTERE Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
14 0910 ऐ DEVANAGARILETTERAI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
15 0911 ऑ DEVANAGARI
LETTERCANDRAO
Vowel HindiKonkaniMarathiKashmiri 1Hindi
[0][100][101][102][108]
[112]
16 0912 ऒ DEVANAGARILETTERSHORTO Vowel Kashmiri 4Kashmiri [0][105][108]
17 0913 ओ DEVANAGARILETTERO Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
18 0914 औ DEVANAGARILETTERAU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
19 0915 क DEVANAGARILETTERKA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
20 0916 ख DEVANAGARILETTERKHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
21 0917 ग DEVANAGARILETTERGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
22 0918 घ DEVANAGARILETTERGHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
23 0919 ङ DEVANAGARILETTERNGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
24 091A च DEVANAGARILETTERCA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
25 091B छ DEVANAGARILETTERCHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
26 091C ज DEVANAGARILETTERJA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
27 091D झ DEVANAGARILETTERJHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
28 091E ञ DEVANAGARILETTERNYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
29 091F ट DEVANAGARILETTERTTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
30 0920 ठ DEVANAGARILETTERTTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
31 0921 ड DEVANAGARILETTERDDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
32 0922 ढ DEVANAGARILETTERDDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
33 0923 ण DEVANAGARILETTERNNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
34 0924 त DEVANAGARILETTERTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
35 0925 थ DEVANAGARILETTERTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
36 0926 द DEVANAGARILETTERDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
37 0927 ध DEVANAGARILETTERDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
38 0928 न DEVANAGARILETTERNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
39 092A प DEVANAGARILETTERPA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
40 092B फ DEVANAGARILETTERPHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
41 092C ब DEVANAGARILETTERBA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
42 092D भ DEVANAGARILETTERBHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
43 092E म DEVANAGARILETTERMA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
44 092F य DEVANAGARILETTERYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
45 0930 र DEVANAGARILETTERRA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
46 0932 ल DEVANAGARILETTERLA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
47 0933 ळ DEVANAGARILETTERLLA Consonant
BodoKonkaniMarathiNepali
Sanskrit1Nepali
[0][102][103][110][112]
[113]
48 0935 व DEVANAGARILETTERVA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
49 0936 श DEVANAGARILETTERSHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
50 0937 ष DEVANAGARILETTERSSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
51 0938 स DEVANAGARILETTERSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
52 0939 ह DEVANAGARILETTERHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
53 093A DEVANAGARI VOWEL SIGN OE Matra Kashmiri 4Kashmiri [11][105][108]
54 093B ऻ DEVANAGARI VOWEL SIGN OOE Matra Kashmiri 4Kashmiri [11][105][108]
55 093C DEVANAGARISIGNNUKTA Nukta
BodoHindiKashmiriMaithiliSantaliSindhi
1Hindi[0][101][105][108][110][109][111]
56 093E ा DEVANAGARIVOWELSIGNAA Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
57 093F ि DEVANAGARIVOWELSIGNI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
58 0940 ी DEVANAGARIVOWELSIGNII Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
59 0941 DEVANAGARIVOWELSIGNU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
60 0942 DEVANAGARIVOWELSIGNUU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
61 0943
DEVANAGARIVOWELSIGNVOCALICR
Matra HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
62 0945
DEVANAGARIVOWELSIGNCANDRAE=candra
MatraHindiKonkaniMarathiSanskrit
Kashmiri1Hindi [0][100][101]
[108]
63 0946
DEVANAGARIVOWELSIGNSHORTE
Matra Kashmiri 4Kashmiri [0][105][108]
64 0947 DEVANAGARIVOWELSIGNE Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
65 0948 DEVANAGARIVOWELSIGNAI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
66 0949 ॉ DEVANAGARIVOWELSIGNCANDRAO
Matra HindiKonkaniMarathiKashmiri 1Hindi [0][100][108]
67 094A ॊ DEVANAGARIVOWELSIGNSHORTO
Matra Kashmiri 4Kashmiri [0][105][108]
68 094B ो DEVANAGARIVOWELSIGNO Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
69 094C ौ DEVANAGARIVOWELSIGNAU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
70 094D DEVANAGARISIGN
VIRAMAHalantVirama
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
71 094F ॏ DEVANAGARIVOWELSIGNAW Matra Kashmiri 4Kashmiri [0][105][108]
72 0956 DEVANAGARIVOWELSIGNUE Matra Kashmiri 4Kashmiri [11][105][108]
73 0957 DEVANAGARIVOWELSIGNUUE Matra Kashmiri 4Kashmiri [11][105][108]
74 0972 ॲ
DEVANAGARILETTERCANDRA
AVowel KonkaniMarathi
Kashmiri2KonkaniMarathi
[9][100][102][108][112]
75 0973 ॳ DEVANAGARILETTEROE Vowel Kashmiri 4Kashmiri [11][105][108]
76 0974 ॴ DEVANAGARILETTEROOE Vowel Kashmiri 4Kashmiri [11][105][108]
77 0975 ॵ DEVANAGARILETTERAW Vowel Kashmiri 4Kashmiri [11][105][108]
78 0976 ॶ DEVANAGARILETTERUE Vowel Kashmiri 4Kashmiri [11][105][108]
79 0977 ॷ DEVANAGARILETTERUUE Vowel Kashmiri 4Kashmiri [11][105][108]
80 097B ॻ DEVANAGARILETTERGGA Consonant Sindhi 2Sindhi [8][104]
81 097C ॼ DEVANAGARILETTERJJA Consonant Sindhi 2Sindhi [8][104]
82 097E ॾ DEVANAGARILETTERDDDA Consonant Sindhi 2Sindhi [8][104]
83 097F ॿ DEVANAGARILETTERBBA Consonant Sindhi 2Sindhi [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the “Eyelash Reph” (footnote 13) construct.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (Not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7 Sequences
13 Unicode uses the term “Eyelash Ra” instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by Normal Ra U+0930), the term “Reph” is used here.
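Because U+0931 enters the repertoire only through the two sequences of Table 7, a label check can be sketched as a negative lookahead: any U+0931 that is not immediately followed by U+094D and then U+092F or U+0939 makes the label invalid. This is a minimal sketch of that one condition, not the panel's implementation; the names are hypothetical.

```python
import re

# A RRA (U+0931) that is NOT followed by VIRAMA + YA or VIRAMA + HA.
STRAY_RRA = re.compile("\u0931(?!\u094d[\u092f\u0939])")


def rra_usage_valid(label):
    """True if every U+0931 in `label` occurs inside one of the Table 7 sequences."""
    return STRAY_RRA.search(label) is None
```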
5.3 Code points not included
The following code points have not been included in the repertoire.
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. The next section gives detailed akshar formation rules as applicable to the representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second section lists four operators, along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen -
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol Function
| Alternative
[] Optional
Variable Repetition
() Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ a | U+0905
Vowel + Anusvara | V[B] | अं aṁ | अ ं U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ aṃ | अ ँ U+0905 U+0901
Vowel + Visarga | V[X] | अः aḥ | अ ः U+0905 U+0903
Table 9
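The vowel sequence V[B|D|X] translates directly into a regular expression with one optional trailing sign. In the sketch below the vowel class is an approximation assembled from the code points marked "Vowel" in Table 6; treat that class and the names `VOWEL_SEQUENCE` and `is_vowel_sequence` as assumptions for illustration.

```python
import re

# Vowels per the Category column of Table 6 (approximation for illustration).
V = "\u0905-\u090b\u090d-\u0914\u0972-\u0977"
# One vowel, optionally followed by exactly one of D (0901), B (0902), X (0903).
VOWEL_SEQUENCE = re.compile(f"[{V}][\u0901-\u0903]?")


def is_vowel_sequence(text):
    """True if `text` is a single vowel sequence V[B|D|X]."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

A Visarga after an Anusvara (for example U+0905 U+0902 U+0903) is rejected, in line with the restriction to a single B, D or X.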
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the Consonant sequence after the N, M and H. Each of these has been discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क ka | U+0915 <single character>
Consonant + Nukta | C[N] | क़ ḳa | क ़ U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि ki | क ि U+0915 U+093F
Consonant + Anusvara | C[B] | कं kaṁ | क ं U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ kaṃ | क ँ U+0915 U+0901
Consonant + Visarga | C[X] | कः kaḥ | क ः U+0915 U+0903
Consonant + Halant | C[H] | क् k (Pure Consonant) | क ् U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं kīṁ | क ी ं U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ kāṃ | क ा ँ U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः kīḥ | क ी ः U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant (footnote 14)
3(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य nkrya | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
14 In the case of Sanskrit, it can join up to 5 consonants.
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets
3A. The combination may be followed by M, B, D or X
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की kkī | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं kkaṁ | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ kkaṃ | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः kkaḥ | U+0915 U+094D U+0915 U+0903
Table 14
3B. 3(CH)CM may be followed by a B, D or X
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं kkīṁ | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ kkīṃ | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः kkīḥ | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
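The consonant sequence rules 1 to 3B can be combined into one pattern: up to three Halant-joined consonants, a final consonant, and then either a Halant or an optional Matra followed by at most one of B, D or X, with an optional Nukta after each consonant. The sketch below is one possible reading of those rules for Hindi, not the normative WLE rules of Section 7 (which place no limit on joined consonants); the character classes are approximations taken from the Category column of Table 6, and all names are hypothetical.

```python
import re

# Character classes approximated from Table 6 (assumptions for illustration).
C = "\u0915-\u0928\u092a-\u0930\u0932\u0933\u0935-\u0939\u097b\u097c\u097e\u097f"
M = "\u093a\u093b\u093e-\u0943\u0945-\u094c\u094f\u0956\u0957"
N, H = "\u093c", "\u094d"

CONS = f"[{C}]{N}?"  # a consonant is coterminous with consonant + Nukta
# (C H) up to three times, then C, then either H or [M][B|D|X].
CONSONANT_SEQUENCE = re.compile(
    f"(?:{CONS}{H}){{0,3}}{CONS}(?:{H}|[{M}]?[\u0901-\u0903]?)"
)


def is_consonant_sequence(text):
    """True if `text` is a single consonant-initial akshar under rules 1 to 3B."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

The Table 13 example (U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F) matches, while a chain of five Halant-joined consonants does not.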
6 Variants
There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String similarity assessment panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in the Section 6.1) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to a possibility of creation of certain labels that can be deceptively similar to a majority of the Devanagari user-base. Being a unique case of homographic similarity, the following variants are being proposed.
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
611 VariantcontextruleforSantaliNuktavariants
All of the Nukta variants given in Table 16 Proposed Variants - Set 1 have a typical
characteristicwhichiswithinavariantpairVariant1isasubsetoftheVariant2egin
thefirstpairआ(U+0906)isasubsetofआ(U+0906U+093C)Thisimpliesaregenerativetendency in theory ie if anआ (U+0906) is substituted with आ (U+0906 U+093C) itintroducesanewinstanceofआ(U+0906)asseenhereinboldआ(U+0906U+093C)By
definitionthisnewcaseofआ(U+0906)mayalsoneedtobesubstitutedwithआ(U+0906U+093C) therebycreatingan invalidakshar combination आ (U+0906U+093CU+093C)whereaNuktawillneedtofollowanotherNuktaTopreventthisavariantcontextrulehas
beenaddedtoalltheabovenuktavariantsasgivenbelow
RuleAspertheTable 16 Proposed Variants - Set 1theVariant1toVariant2relationship
existsifandonlyifanyoftheVariant1setcharacterisnotfollowedbyaNukta(U+093C)
characterThusfollowingvariantrelationsareboundbytheabovecondition
आ(U+0906) rarr आ(U+0906U+093C)
ओ(U+0913) rarr ओ(U+0913U+093C)
ा(U+093E) rarr (U+093EU+093C)
ो(U+094B) rarr (U+094BU+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both the forward and the reverse mapping using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to the context rules on the Nukta, the condition is only required for the first reason in this case.
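The "not followed by a Nukta" condition can be illustrated with a minimal Python sketch. It is not the normative XML rule from the accompanying LGR file; the function names are hypothetical and the only data used are the four Variant 1 characters of Table 16.

```python
NUKTA = "\u093C"

# Variant 1 characters from Table 16 (Santali Nukta variants).
NUKTA_VARIANT_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variant_positions(label):
    """Return the indexes at which the Variant 1 -> Variant 2 mapping is defined.

    The mapping is defined only where a Variant 1 character is NOT
    already followed by a Nukta (the variant context rule).
    """
    return [
        i
        for i, ch in enumerate(label)
        if ch in NUKTA_VARIANT_BASES and label[i + 1 : i + 2] != NUKTA
    ]


def apply_nukta_variant(label, index):
    """Substitute Variant 2 for Variant 1 at one eligible position."""
    if index not in nukta_variant_positions(label):
        raise ValueError("variant mapping is not defined at this position")
    return label[: index + 1] + NUKTA + label[index + 1 :]
```

After one substitution the same position is no longer eligible, so the regenerative case (U+0906 U+093C U+093C) cannot be produced by repeated application.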
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set              Mapping
Variant Set A    093E 0901 <--> 0949 0902
Variant Set B    0906 0901 <--> 0911 0902
Variant Set C    0906 0902 <--> 0974
Variant Set D    093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source   Glyph        Target       Glyph   Type      Variant Context
0906     आ     ↔     0906 093C    आ़      blocked   not followed-by-N
093E     ा     ↔     093E 093C    ा़      blocked   not followed-by-N
When the variant from the overlapping sets is substituted for 0906 or 093E, respectively, in the sequences of Variant Sets A, B, C and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence or absence of a Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection of these contexts.
Likewise for Set D: 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set              Mapping                                            Variant Contextual Rule
Variant Set A    093E 0901 <--> 093E 093C 0901 <--> 0949 0902       -
Variant Set B    0906 0901 <--> 0906 093C 0901 <--> 0911 0902       -
Variant Set C    0906 0902 <--> 0906 093C 0902 <--> 0974            when(followed-by-V-C-or-end)
Variant Set D    093B <--> 093E 0902 <--> 093E 093C 0902            when(followed-by-V-C-or-end)
Additional Notes
1 Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 technically overlaps with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2 Another variant set that overlaps with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
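The extension of Sets A to D can be reproduced mechanically. The Python sketch below is an illustration only: the function names are hypothetical, it models just the two overlapping Nukta mappings listed above (0906 and 093E), and it ignores the when(followed-by-V-C-or-end) context.

```python
NUKTA = 0x093C

# Base variant sets from the text, as tuples of code points.
BASE_SETS = {
    "A": [(0x093E, 0x0901), (0x0949, 0x0902)],
    "B": [(0x0906, 0x0901), (0x0911, 0x0902)],
    "C": [(0x0906, 0x0902), (0x0974,)],
    "D": [(0x093B,), (0x093E, 0x0902)],
}

# Code points that have an overlapping "plus Nukta" variant.
NUKTA_BASES = {0x0906, 0x093E}


def nukta_expansions(seq):
    """Yield seq with a Nukta inserted after each base not already followed by one."""
    for i, cp in enumerate(seq):
        followed_by_nukta = i + 1 < len(seq) and seq[i + 1] == NUKTA
        if cp in NUKTA_BASES and not followed_by_nukta:
            yield seq[: i + 1] + (NUKTA,) + seq[i + 1 :]


def extend_sets(base_sets):
    """Add the Nukta-bearing sequences to every set that overlaps."""
    extended = {}
    for name, members in base_sets.items():
        result = list(members)
        for seq in members:
            for new_seq in nukta_expansions(seq):
                if new_seq not in result:
                    result.append(new_seq)
        extended[name] = result
    return extended
```

Under these assumptions, each set gains exactly one sequence, matching the four additions listed in the text.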
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels and Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel or Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1        Variant 2
ॳ  U+0973        अं  U+0905 U+0902
   U+093A           U+0902
ॴ  U+0974        आं  U+0906 U+0902
ऻ  U+093B        ां  U+093E U+0902
ऎ  U+090E        ऐ  U+0910
   U+0946           U+0947
ॵ  U+0975        औ  U+0914
ॏ  U+094F        ौ  U+094C
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mapping targets will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in a Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence or absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case for making them variant pairs. In the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when followed by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which are rendered exactly alike in many fonts. Though some fonts render them differently, the behavior is not consistent and leaves room for ambiguity. The actual variant pairs are as follows:
Variant 1              Variant 2
    U+0901                 U+0945 U+0902
ाँ  U+093E U+0901      ॉं  U+0949 U+0902
अँ  U+0905 U+0901          U+0972 U+0902
एँ  U+090F U+0901      ऍं  U+090D U+0902
आँ  U+0906 U+0901      ऑं  U+0911 U+0902
Table 18: Proposed Variants - Set 3
As the first two pairs are composed only of dependent signs, they do not join properly in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence to the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that this sequence begins with a Matra sign, i.e. U+0945. While the Candrabindu (U+0901) can be preceded by a Consonant, a Vowel, a Nukta or a Matra, the same is not true for U+0945, which is itself a Matra: a Matra can only be preceded by a Consonant or by a Consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or by a Consonant followed by a Nukta.
There is no additional constraint on the character preceding (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as the mapping (U+0901) -> (U+0945 U+0902).
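As an illustration of this context rule, the following Python sketch replaces a Candrabindu only where it is preceded by a Consonant or by a Consonant followed by a Nukta. The helper names are hypothetical, and treating U+0915..U+0939 as "Consonant" is a simplifying assumption rather than the normative repertoire.

```python
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093C"


def is_consonant(ch):
    # Assumption: the block U+0915..U+0939 approximates category C.
    return "\u0915" <= ch <= "\u0939"


def preceded_by_c_or_cn(label, index):
    """True if label[index] follows a Consonant, or a Consonant plus Nukta."""
    if index >= 1 and is_consonant(label[index - 1]):
        return True
    return (
        index >= 2
        and label[index - 1] == NUKTA
        and is_consonant(label[index - 2])
    )


def candrabindu_variant(label):
    """Replace each eligible U+0901 with U+0945 U+0902; leave the rest untouched."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and preceded_by_c_or_cn(label, i):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu after a Vowel is left alone, because the Matra U+0945 could not validly follow that Vowel.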
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the case where a label in one script can be composed in such a way that it resembles an entire label in a different script.
Every individual LGR under the NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations were taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + …" cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are considered equivalents of each other, semantically or otherwise. They are grouped only on the basis of possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is expected to be found in the Bengali and Gurmukhi LGR documents.
Devanagari                          Gurmukhi
   U+0902                              U+0A02
इ  U+0907                           ਙ  U+0A19
उ  U+0909                           ਤ  U+0A24
ग  U+0917                           ਗ  U+0A17
घ  U+0918                           ਬ  U+0A2C
ट  U+091F                           ਟ  U+0A1F
ठ  U+0920                           ਠ  U+0A20
ढ  U+0922                           ਫ  U+0A2B
प  U+092A                           ਧ  U+0A27
भ  U+092D                           ਮ  U+0A2E
म  U+092E                           ਸ  U+0A38
व  U+0935                           ਕ  U+0A15
ह  U+0939                           ਵ  U+0A35
   U+093A                              U+0A02
   U+093C                              U+0A3C
ि  U+093F                           ਿ  U+0A3F
ी  U+0940                           ੀ  U+0A40
   U+0945                              U+0A71
   U+0946                              U+0A47
   U+0946                              U+0A4B
   U+0947                              U+0A47
   U+0947                              U+0A4B
   U+0948                              U+0A48
   U+0956                              U+0A41
   U+0957                              U+0A42
   U+092A U+094D U+091F U+093F      ਇ  U+0A07
   U+092A U+094D U+091F U+0940      ਈ  U+0A08
   U+092A U+094D U+091F U+0947      ਏ  U+0A0F
   U+092A U+094D U+091F U+0946      ਏ  U+0A0F
   U+0924 U+094D U+0924             ਜ  U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari       Bengali
म  U+092E        ম  U+09AE
ि  U+093F        ি  U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a set of possible cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the Categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (DEVANAGARI SIGN VIRAMA) and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1 N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
The set V1 consists of these vowels:
a आ (U+0906) (required in the Santali language)
b ओ (U+0913) (required in the Santali language)
The set M1 consists of these matras:
a ा (U+093E) (required in the Santali language)
b ो (U+094B) (required in the Santali language)
2 H must be preceded by C or CN (where CN is a C followed by an N)
3 M must be preceded by C or CN (where CN is a C followed by an N)
4 X must be preceded by any of V, C, N or M
5 B must be preceded by any of V, C, N or M
6 D must be preceded by any of V, C, N or M
7 V can NOT be preceded by H (details in "Case of V preceded by H")
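The seven rules above can be read as checks on each code point's predecessor. The Python sketch below shows one way to evaluate them; it is an assumption-laden illustration, not the LGR XML. In particular, `category` uses rough Unicode ranges instead of the normative Table 6, the function names are hypothetical, and sequence-level constraints such as the Eyelash Reph are not modelled.

```python
NUKTA, HALANT = 0x093C, 0x094D
C1 = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C, 0x0921, 0x0922, 0x092B}
V1 = {0x0906, 0x0913}
M1 = {0x093E, 0x094B}
RULE_FOR_MARK = {"X": 4, "B": 5, "D": 6}


def category(cp):
    """Rough category lookup (illustrative ranges, not the normative repertoire)."""
    if cp == 0x0901:
        return "D"
    if cp == 0x0902:
        return "B"
    if cp == 0x0903:
        return "X"
    if cp == NUKTA:
        return "N"
    if cp == HALANT:
        return "H"
    if 0x0905 <= cp <= 0x0914 or 0x0972 <= cp <= 0x0975:
        return "V"
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if cp in (0x093A, 0x093B, 0x094F) or 0x093E <= cp <= 0x094C:
        return "M"
    return None


def wle_violations(label):
    """Return (index, rule_number) for every code point that breaks a rule."""
    cps = [ord(ch) for ch in label]
    cats = [category(cp) for cp in cps]
    found = []
    for i, cat in enumerate(cats):
        prev_cp = cps[i - 1] if i > 0 else None
        prev = cats[i - 1] if i > 0 else None
        # "C or CN": previous is C, or previous is N that itself follows C.
        after_c_or_cn = prev == "C" or (
            prev == "N" and i >= 2 and cats[i - 2] == "C"
        )
        if cat == "N" and prev_cp not in (C1 | V1 | M1):
            found.append((i, 1))
        elif cat == "H" and not after_c_or_cn:
            found.append((i, 2))
        elif cat == "M" and not after_c_or_cn:
            found.append((i, 3))
        elif cat in RULE_FOR_MARK and prev not in {"V", "C", "N", "M"}:
            found.append((i, RULE_FOR_MARK[cat]))
        elif cat == "V" and prev == "H":
            found.append((i, 7))
    return found
```

An empty result means that none of rules 1 to 7 is violated under these simplified categories.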
Additional rules are used only for variants where a Nukta maps to null, or for variants that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or the end of the label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1 As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2 The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins with either a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any valid akshar-ending character. Only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where a V may need to be allowed to follow an H, e.g. आम्अचार aːməchaːr "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of the H for full representation of the intended sound. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, a V is never required to follow an H. Permitting this may create a perceived similarity between two labels (with and without the H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 ContributorsNBGPCo-chairsDrUdayaNarayanSinghMrMaheshDKulkarniandDrAjayData
FollowingisthefulllistofNBGPmemberswiththeirLanguageexpertise
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
In addition, the following individuals outside the NBGP provided input to the NBGP for their respective languages/scripts:
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
9 References
[MSR]IntegrationPanelMaximalStartingRepertoiremdashMSR-4OverviewandRationale7February2019httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf(Accessedon18thFeb2019)
[EGIDS]ExpandedGradedIntergenerationalDisruptionScalehttpswwwethnologuecomaboutlanguage-status(Accessedon13thNov2017)
[NBGP]Neo-BrahmiGenerationPanel
[RFC 7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC 8228] Freytag, A., "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", RFC 8228, August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD]genericTopLevelDomain
[ISCII]IndianScriptCodeforInformationInterchangehttpscdacinindexaspxid=mlc_gist_iscii(Accessedon2ndFeb2018)
[GIST]GraphicsIntelligencebasedScriptTechnologieshttpscdacinindexaspxid=gist(Accessedon2ndFeb2018)
[C-DAC]CentreforDevelopmentofAdvancedComputinghttpscdacin(Accessedon2ndFeb2018)
[0]TheUnicodeStandard11httpwwwunicodeorgversionsUnicode110(Accessedon12thDec2017)
[8]TheUnicodeStandard50httpwwwunicodeorgversionsUnicode500(Accessedon12thDec2017)
[9]TheUnicodeStandard51httpwwwunicodeorgversionsUnicode510(Accessedon12thDec2017)
[11]TheUnicodeStandard60httpwwwunicodeorgversionsUnicode600(Accessedon12thDec2017)
[100]DevanāgarīVIPTeamldquoVariantIssuesReportrdquoICANN3rdOct2011httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf(Accessedon10thOct2017)
[101]OmniglotHindihttpswwwomniglotcomwritinghindihtm(Accessedon10thOct2017)
[102]OmniglotMarathihttpswwwomniglotcomwritingmarathihtm(Accessedon10thOct2017)
[103]OmniglotSanskrithttpswwwomniglotcomwritingsanskrithtm(Accessedon10thOct2017)
[104]OmniglotSindhihttpswwwomniglotcomwritingsindhihtm(Accessedon10thOct2017)
[105]OmniglotKashmirihttpswwwomniglotcomwritingkashmirihtm(Accessedon10thOct2017)
[106]Unicode1000SouthandCentralAsia-I-OfficialScriptsofIndiardquoPage456(R5andR5a)httpwwwunicodeorgversionsUnicode1000ch12pdf(Accessedon13thNov2017)
[107]UnicodeIndicGroupDevanagariEyelashRahttpunicodeorg~emulleriwgp8utcdochtml(Accessedon13thNov2017)
[108]MKRainaHowtoreadandwriteKashmiriinDevanagarihttpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf(Accessedon12thDec2017)
[109]CentralHindiDirectorate-MinistryofHRD-GovtofIndiaDevanāgarīAlphabetanditsRomanizationhttphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml(Accessedon12thDec2017
[110]OmniglotBodohttpswwwomniglotcomwritingbodohtm(Accessedon12thDec2017)
[111]OmniglotMaithilihttpswwwomniglotcomwritingmaithilihtm(Accessedon12thDec2017)
[112]OmniglotKonkanihttpswwwomniglotcomwritingkonkanihtm(Accessedon20thMay2018)
[113]OmniglotNepalihttpswwwomniglotcomwritingnepalihtm(Accessedon20thMay2018)
[114] NBGP Public comment feedback for Devanagari Gujarati Gurmukhi Script LGRProposalshttpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw(Accessedon18thFeb2019)
10 Books, articles and webographies consulted
The following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
11 KachruYamuna1966AnIntroductiontoHindiSyntaxUrbanaUniversityof
IllinoisDepartmentofLinguistics
12 KalyanKaleandAnjaliSoman1986LearningMarathiShriVishakhaPrakashan
Pune
13 McGregorRS(1977)OutlineofHindiGrammar2ndedDelhiOxfordUniversity
Press
14 McGregorRS1972OutlineofHindiGrammarwithExercisesDelhiOxford
UniversityPress
15 McGregorRS1974HindiLiteratureoftheNineteenthandEarlyTwentieth
CenturiesWiesbadenHarrassowitz
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters and character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1     Confusable 2
क  U+0915        क़  U+0915 U+093C
ख  U+0916        ख़  U+0916 U+093C
ग  U+0917        ग़  U+0917 U+093C
च  U+091A        च़  U+091A U+093C
छ  U+091B        छ़  U+091B U+093C
ज  U+091C        ज़  U+091C U+093C
ड  U+0921        ड़  U+0921 U+093C
ढ  U+0922        ढ़  U+0922 U+093C
फ  U+092B        फ़  U+092B U+093C
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are grouped only on the basis of possible visual confusability.
At first they may not look exactly the same. However, in a given context, e.g. in a browser bar, as a part of a domain name, or as a single word where there is no surrounding text from the same script to help distinguish them, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all of its constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it provides a visual distinction and hence makes the string non-confusable.
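The all-or-nothing condition in the last paragraph can be sketched in Python as follows. The function name is hypothetical, and the mapping is only a small single-code-point subset of the Devanagari-Gurmukhi pairs in Table 19, chosen for illustration.

```python
# Illustrative subset of Table 19 (Devanagari -> Gurmukhi); not the full table.
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # GA
    "\u091F": "\u0A1F",  # TTA
    "\u0920": "\u0A20",  # TTHA
    "\u093F": "\u0A3F",  # VOWEL SIGN I
    "\u0940": "\u0A40",  # VOWEL SIGN II
}


def gurmukhi_lookalike(label):
    """Return the Gurmukhi look-alike label, or None if any character has no counterpart."""
    mapped = [DEVA_TO_GURU.get(ch) for ch in label]
    if any(m is None for m in mapped):
        # One unmatched character is enough to make the label visually distinct.
        return None
    return "".join(mapped)
```

A single character without a counterpart is enough for the function to report that no cross-script variant label exists.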
Devanagari confusable    Other script confusable    From script
ः  U+0903                ઃ  U+0A83                  Gujarati
ः  U+0903                ః  U+0C03                  Telugu
ः  U+0903                ಃ  U+0C83                  Kannada
ः  U+0903                ഃ  U+0D03                  Malayalam
ः  U+0903                ඃ  U+0D83                  Sinhala
ः  U+0903                ঃ  U+0983 (17)             Bengali
उ  U+0909                ও  U+0993                  Bengali
घ  U+0918                ঘ  U+0998                  Bengali
ठ  U+0920                ਨ  U+0A28                  Gurmukhi
ठ  U+0920                ਰ  U+0A30                  Gurmukhi
ड  U+0921                ਡ  U+0A21                  Gurmukhi
ड  U+0921                ਤ  U+0A24                  Gurmukhi
ढ  U+0922                ਢ  U+0A22                  Gurmukhi
त  U+0924                ਜ  U+0A1C                  Gurmukhi
य  U+092F                ਧ  U+0A27                  Gurmukhi
   U+0945                   U+0981                  Bengali
Table 22: Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
4.1.2.2 No Punctuation Marks
TLDs being identifiers, punctuation markers present in Brahmi-based languages, such as Danda (U+0964) and double Danda (U+0965), will not be included.
4.1.2.3 No Symbols and Abbreviations
Abbreviations, weights and measures, and other such iconic characters, like Isshar (U+09FA), the Abbreviation sign (U+0970), etc., will not be included.
4.1.2.4 No Rare and Obsolete Characters
There are characters which have been added to Unicode to accommodate rare forms, such as DEVANAGARI LETTER VOCALIC RR ॠ (U+0960) and DEVANAGARI LETTER VOCALIC LL ॡ (U+0961), as well as their Matra forms (U+0944) and (U+0963). No such characters will be included. This is in compliance with the Conservatism principle as laid down in the Root Zone LGR Procedure.
4.1.2.5 No Stress Markers of Classical Sanskrit and Vedic
Stress markers for classical Sanskrit, e.g. DEVANAGARI STRESS SIGN UDATTA (U+0951) and DEVANAGARI STRESS SIGN ANUDATTA (U+0952), will not be included. This is also in compliance with the Letter principle as laid down in the Root Zone LGR Procedure.
4.2 Methodology to incorporate the feedback received through the Public Comment process
The Devanagari script LGR proposal was published for public comment to allow those who had not participated in the NBGP to make their views known. The Devanagari LGR received various comments during the public comment process. Most of the comments received were editorial in nature; some called for attention to the normative sections of the document. The NBGP at large, and the Devanagari team in particular, went through the comments in detail and decided, for each individual comment received, whether it needed a change in the overall LGR recommendation.
Wherever the Devanagari team decided that a change was necessary, the change was made. The rest of the comments were addressed with a detailed explanation of why the requested change was not necessary. An elaborate document with all such explanations was shared with the NBGP at large. With the overall agreement of the entire NBGP, the Devanagari LGR was finalized. The analysis of public comments can be accessed online at [114].
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2: Devanagari Code Page from [MSR]
Color convention (12):
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the MSR for various reasons - Pinkish background
Not PVALID in IDNA2008 - White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references have been given in the last column, titled Reference. For the entire coverage of Devanagari code points, references for Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, together they cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
SrNo
UnicodeCodePoint
Glyph CharacterName Category
Examplelanguagesusingthecodepoint(Notexhaustive
list)
LanguagewithlowestEGIDSscaleusingthecodepoint
Reference
1 0901 DEVANAGARISIGNCANDRABINDU Candrabindu
BodoHindiKashmiriKonkaniMaithiliMarathiNepaliSantaliand
Sanskrit
1HindiNepali
[0][101][102][103][105][108][110][111][112]
[113]
2 0902 DEVANAGARISIGNANUSVARA
Anusvara(Bindu)
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
3 0903 ः DEVANAGARISIGNVISARGA Visarga
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
4 0905 अ DEVANAGARILETTERA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
5 0906 आ DEVANAGARILETTERAA Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
6 0907 इ DEVANAGARILETTERI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
7 0908 ई DEVANAGARILETTERII Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
8 0909 उ DEVANAGARILETTERU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
9 090A ऊ DEVANAGARILETTERUU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
10 090B ऋ DEVANAGARI
LETTERVOCALICR
Vowel HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
11 090D ऍ DEVANAGARILETTERCANDRAE Vowel Hindi 1Hindi [0][101]
12 090E ऎ DEVANAGARILETTERSHORTE Vowel Kashmiri 4Kashmiri [0][105][108]
13 090F ए DEVANAGARILETTERE Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
14 0910 ऐ DEVANAGARILETTERAI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
15 0911 ऑ DEVANAGARI
LETTERCANDRAO
Vowel HindiKonkaniMarathiKashmiri 1Hindi
[0][100][101][102][108]
[112]
16 0912 ऒ DEVANAGARILETTERSHORTO Vowel Kashmiri 4Kashmiri [0][105][108]
17 0913 ओ DEVANAGARILETTERO Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
18 0914 औ DEVANAGARILETTERAU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
19 0915 क DEVANAGARILETTERKA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
20 0916 ख DEVANAGARILETTERKHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
21 0917 ग DEVANAGARILETTERGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
22 0918 घ DEVANAGARILETTERGHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
23 0919 ङ DEVANAGARILETTERNGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
24 091A च DEVANAGARILETTERCA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
25 091B छ DEVANAGARILETTERCHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
26 091C ज DEVANAGARILETTERJA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
27 091D झ DEVANAGARILETTERJHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
28 091E ञ DEVANAGARILETTERNYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
29 091F ट DEVANAGARILETTERTTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
30 0920 ठ DEVANAGARILETTERTTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
31 0921 ड DEVANAGARILETTERDDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
32 0922 ढ DEVANAGARILETTERDDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
33 0923 ण DEVANAGARILETTERNNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
34 0924 त DEVANAGARILETTERTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
35 0925 थ DEVANAGARILETTERTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
36 0926 द DEVANAGARILETTERDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
37 0927 ध DEVANAGARILETTERDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
38 0928 न DEVANAGARILETTERNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
39 092A प DEVANAGARILETTERPA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
40 092B फ DEVANAGARILETTERPHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
41 092C ब DEVANAGARILETTERBA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
42 092D भ DEVANAGARILETTERBHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
43 092E म DEVANAGARILETTERMA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
44 092F य DEVANAGARILETTERYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
45 0930 र DEVANAGARILETTERRA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
46 0932 ल DEVANAGARILETTERLA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
47 0933 ळ DEVANAGARILETTERLLA Consonant
BodoKonkaniMarathiNepali
Sanskrit1Nepali
[0][102][103][110][112]
[113]
48 0935 व DEVANAGARILETTERVA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
49 0936 श DEVANAGARILETTERSHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
50 0937 ष DEVANAGARILETTERSSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
51 0938 स DEVANAGARILETTERSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
52 0939 ह DEVANAGARILETTERHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
53 093A DEVANAGARI VOWEL SIGN OE Matra Kashmiri 4Kashmiri [11][105][108]
54 093B ऻ DEVANAGARI VOWEL SIGN OOE Matra Kashmiri 4Kashmiri [11][105][108]
55 093C DEVANAGARISIGNNUKTA Nukta
BodoHindiKashmiriMaithiliSantaliSindhi
1Hindi[0][101][105][108][110][109][111]
56 093E ा DEVANAGARIVOWELSIGNAA Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
57 093F ि DEVANAGARIVOWELSIGNI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
58 0940 ी DEVANAGARIVOWELSIGNII Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
59 0941 DEVANAGARIVOWELSIGNU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
60 0942 DEVANAGARIVOWELSIGNUU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
61 0943
DEVANAGARIVOWELSIGNVOCALICR
Matra HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
62 0945
DEVANAGARIVOWELSIGNCANDRAE=candra
MatraHindiKonkaniMarathiSanskrit
Kashmiri1Hindi [0][100][101]
[108]
63 0946
DEVANAGARIVOWELSIGNSHORTE
Matra Kashmiri 4Kashmiri [0][105][108]
64 0947 DEVANAGARIVOWELSIGNE Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
65 0948 DEVANAGARIVOWELSIGNAI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
66 0949 ॉ DEVANAGARIVOWELSIGNCANDRAO
Matra HindiKonkaniMarathiKashmiri 1Hindi [0][100][108]
67 094A ॊ DEVANAGARIVOWELSIGNSHORTO
Matra Kashmiri 4Kashmiri [0][105][108]
68 094B ो DEVANAGARIVOWELSIGNO Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
69 094C ौ DEVANAGARIVOWELSIGNAU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
70 094D DEVANAGARISIGN
VIRAMAHalantVirama
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
71 094F ॏ DEVANAGARIVOWELSIGNAW Matra Kashmiri 4Kashmiri [0][105][108]
72 0956 DEVANAGARIVOWELSIGNUE Matra Kashmiri 4Kashmiri [11][105][108]
73 0957 DEVANAGARIVOWELSIGNUUE Matra Kashmiri 4Kashmiri [11][105][108]
74 0972 ॲ
DEVANAGARILETTERCANDRA
AVowel KonkaniMarathi
Kashmiri2KonkaniMarathi
[9][100][102][108][112]
75 0973 ॳ DEVANAGARILETTEROE Vowel Kashmiri 4Kashmiri [11][105][108]
76 0974 ॴ DEVANAGARILETTEROOE Vowel Kashmiri 4Kashmiri [11][105][108]
77 0975 ॵ DEVANAGARILETTERAW Vowel Kashmiri 4Kashmiri [11][105][108]
78 0976 ॶ DEVANAGARILETTERUE Vowel Kashmiri 4Kashmiri [11][105][108]
79 0977 ॷ DEVANAGARILETTERUUE Vowel Kashmiri 4Kashmiri [11][105][108]
80 097B ॻ DEVANAGARILETTERGGA Consonant Sindhi 2Sindhi [8][104]
81 097C ॼ DEVANAGARILETTERJJA Consonant Sindhi 2Sindhi [8][104]
82 097E ॾ DEVANAGARILETTERDDDA Consonant Sindhi 2Sindhi [8][104]
83 097F ॿ DEVANAGARILETTERBBA Consonant Sindhi 2Sindhi [8][104]
Table 6: Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph"13 construct.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7: Sequences
13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by Normal Ra, U+0930), the term "Reph" is used here.
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. In the next section, there are detailed akshar formation rules as applicable to representation of the Hindi language when written in the Devanagari Script. These rules need slight additions for different languages written in Devanagari in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, instead of their descriptive names, the variable names are used. The second section lists four operators along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol    Function
|         Alternative
[]        Optional
          Variable Repetition
()        Sequence Group
Table 8: Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ (a) | U+0905
Vowel + Anusvara | V[B] | अं (aṁ) | अ ं (U+0905 U+0902)
Vowel + Candrabindu | V[D] | अँ (aṃ) | अ ँ (U+0905 U+0901)
Vowel + Visarga | V[X] | अः (aḥ) | अ ः (U+0905 U+0903)
Table 9
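The vowel sequence rule V[B|D|X] can be sketched as a regular expression. This is a minimal illustration, not part of the proposal: the name `is_vowel_sequence` is hypothetical, and the Hindi vowel set used for V is an assumption drawn from the repertoire rows whose lowest-EGIDS language is Hindi.

```python
import re

# Assumption: V is approximated by the independent vowels that Table 6
# associates with Hindi; this set is illustrative, not normative.
V = "[\u0905-\u090B\u090D\u090F-\u0911\u0913\u0914]"
B = "\u0902"  # Anusvara
D = "\u0901"  # Candrabindu
X = "\u0903"  # Visarga

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(f"{V}[{B}{D}{X}]?")


def is_vowel_sequence(text: str) -> bool:
    """Return True if text is exactly one vowel sequence V[B|D|X]."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

A Visarga after a Candrabindu (U+0905 U+0901 U+0903) is rejected, which matches the restriction to a single B, D or X.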
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the Consonant sequence after the N, M and H. Each of these has been discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क (ka) | U+0915 <single character>
Consonant + Nukta | C[N] | क़ (ḳa) | क ़ (U+0915 U+093C)
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि (ki) | क ि (U+0915 U+093F)
Consonant + Anusvara | C[B] | कं (kaṁ) | क ं (U+0915 U+0902)
Consonant + Candrabindu | C[D] | कँ (kaṃ) | क ँ (U+0915 U+0901)
Consonant + Visarga | C[X] | कः (kaḥ) | क ः (U+0915 U+0903)
Consonant + Halant | C[H] | क् (k, Pure Consonant) | क ् (U+0915 U+094D)
Table 11
2A. A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं (kīṁ) | क ी ं (U+0915 U+0940 U+0902)
Consonant + Matra + Candrabindu | CM[D] | काँ (kāṃ) | क ा ँ (U+0915 U+093E U+0901)
Consonant + Matra + Visarga | CM[X] | कीः (kīḥ) | क ी ः (U+0915 U+0940 U+0903)
Table 12
3. A sequence of consonants (up to 4) joined by Halant14: 3(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य (nkrya) | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets
3A. The combination may be followed by M, B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की (kkī) | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं (kkaṁ) | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ (kkaṃ) | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः (kkaḥ) | U+0915 U+094D U+0915 U+0903
Table 14
14 In the case of Sanskrit, it can join up to 5 consonants.
3B. 3(CH)CM may be followed by a B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं (kkīṁ) | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ (kkīṃ) | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः (kkīḥ) | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
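The consonant sequence cases 1 to 3B above can be folded into one pattern. The sketch below is one possible consolidation, not the panel's own formulation: the name `is_consonant_sequence` is hypothetical, the Hindi consonant and Matra sets are approximations taken from Table 6, and a Nukta is accepted after any consonant (the WLE rules in Section 7 are stricter).

```python
import re

# Assumptions: illustrative Hindi subsets of the Table 6 repertoire.
C = "[\u0915-\u0928\u092A-\u0930\u0932\u0935-\u0939]"   # consonants
M = "[\u093E-\u0943\u0945\u0947-\u0949\u094B\u094C]"     # matras
N = "\u093C"                                             # Nukta
H = "\u094D"                                             # Halant
B, D, X = "\u0902", "\u0901", "\u0903"

# A consonant is treated as coterminous with consonant + Nukta (case 1).
CN = f"{C}{N}?"

# Up to three (consonant + Halant) groups, then a final consonant that
# either takes a Halant or an optional Matra and an optional B, D or X.
CONSONANT_SEQUENCE = re.compile(
    f"(?:{CN}{H}){{0,3}}{CN}(?:{H}|{M}?[{B}{D}{X}]?)"
)


def is_consonant_sequence(text: str) -> bool:
    """Return True if text is one consonant sequence as described above."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

The bound of three Halant-joined groups reflects the "up to 4 consonants" limit stated for Hindi; raising it to four would model the Sanskrit case in footnote 14.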
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
6 Variants
There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String Similarity Assessment Panel) entrusted to deal with such cases. Table 21: Visually confusables, in Appendix A: Visually confusable characters/sequences, lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to a possibility of creation of certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed.
Variant 1 | Variant 2
आ (U+0906) | आ़ (U+0906 U+093C)
ओ (U+0913) | ओ़ (U+0913 U+093C)
ा (U+093E) | ा़ (U+093E U+093C)
ो (U+094B) | ो़ (U+094B U+093C)
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16: Proposed Variants - Set 1 have a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new case of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C), where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16: Proposed Variants - Set 1, the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus, the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
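The effect of the "not followed by a Nukta" condition can be illustrated with a short sketch. The function name `forward_nukta_variants` is hypothetical, and the code only enumerates the forward (Variant 1 to Variant 2) mappings of Table 16; it does not model the full LGR variant machinery of [RFC 7940].

```python
from itertools import product

NUKTA = "\u093C"
# Variant 1 code points from Table 16 (Santali Nukta variants).
VARIANT_1 = {"\u0906", "\u0913", "\u093E", "\u094B"}


def forward_nukta_variants(label: str) -> set:
    """Labels reachable by adding a Nukta after eligible Variant 1 code points.

    The mapping is defined only where the code point is NOT already
    followed by a Nukta, so a double Nukta can never be generated.
    """
    options = []
    for i, ch in enumerate(label):
        followed_by_nukta = label[i + 1:i + 2] == NUKTA
        if ch in VARIANT_1 and not followed_by_nukta:
            options.append((ch, ch + NUKTA))  # keep as-is, or map forward
        else:
            options.append((ch,))             # context rule blocks the mapping
    return {"".join(choice) for choice in product(*options)} - {label}
```

Applying the function to a label that already contains U+0906 U+093C yields no further variant at that position, which is the regenerative loop the context rule is meant to stop.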
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8828]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D: 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
Additional Notes:
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another overlapped variant set with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, they are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ (U+0973) | अं (U+0905 U+0902)
ऺ (U+093A) | ं (U+0902)
ॴ (U+0974) | आं (U+0906 U+0902)
ऻ (U+093B) | ां (U+093E U+0902)
ऎ (U+090E) | ऐ (U+0910)
ॆ (U+0946) | े (U+0947)
ॵ (U+0975) | औ (U+0914)
ॏ (U+094F) | ौ (U+094C)
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)
Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases, the presence or absence of the final Halant is clearly visible and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess if these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of the variants are as follows:
Variant 1 | Variant 2
ँ (U+0901) | ॅं (U+0945 U+0902)
ाँ (U+093E U+0901) | ॉं (U+0949 U+0902)
अँ (U+0905 U+0901) | ॲं (U+0972 U+0902)
एँ (U+090F U+0901) | ऍं (U+090D U+0902)
आँ (U+0906 U+0901) | ऑं (U+0911 U+0902)
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, like a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by a Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, the cases of (U+0945 U+0902) and (U+0949 U+0902) should be rendered such that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18: Proposed Variants - Set 3, the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
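A minimal sketch of this context rule follows. The function name `map_candrabindu` is hypothetical, and the consonant set is an assumption transcribed from Table 6; only the single pair U+0901 ↔ U+0945 U+0902 is handled, not the other pairs of Table 18.

```python
CANDRABINDU = "\u0901"
ANUSVARA = "\u0902"
CANDRA_E = "\u0945"
NUKTA = "\u093C"

# Assumption: consonant code points as listed in Table 6 of this document.
_CONSONANT_SPANS = [(0x0915, 0x0928), (0x092A, 0x0930), (0x0932, 0x0933),
                    (0x0935, 0x0939), (0x097B, 0x097C), (0x097E, 0x097F)]
CONSONANTS = frozenset(
    chr(cp) for first, last in _CONSONANT_SPANS for cp in range(first, last + 1)
)


def map_candrabindu(label: str) -> str:
    """Replace each U+0901 with U+0945 U+0902 where the context rule holds.

    The mapping applies only when the Candrabindu is preceded by a
    consonant (C) or by a consonant followed by a Nukta (CN).
    """
    out = []
    for i, ch in enumerate(label):
        prev = label[i - 1] if i >= 1 else ""
        prev2 = label[i - 2] if i >= 2 else ""
        after_c = prev in CONSONANTS
        after_cn = prev == NUKTA and prev2 in CONSONANTS
        if ch == CANDRABINDU and (after_c or after_cn):
            out.append(CANDRA_E + ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu after a vowel or a Matra is left untouched here, since a Matra such as U+0945 could not validly follow those characters.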
There is no additional constraint on the preceding character of ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. Cases listed in Table 19 are the variants that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं (U+0902) | ਂ (U+0A02)
इ (U+0907) | ਙ (U+0A19)
उ (U+0909) | ਤ (U+0A24)
ग (U+0917) | ਗ (U+0A17)
घ (U+0918) | ਬ (U+0A2C)
ट (U+091F) | ਟ (U+0A1F)
ठ (U+0920) | ਠ (U+0A20)
ढ (U+0922) | ਫ (U+0A2B)
प (U+092A) | ਧ (U+0A27)
भ (U+092D) | ਮ (U+0A2E)
म (U+092E) | ਸ (U+0A38)
व (U+0935) | ਕ (U+0A15)
ह (U+0939) | ਵ (U+0A35)
ऺ (U+093A) | ਂ (U+0A02)
़ (U+093C) | ਼ (U+0A3C)
ि (U+093F) | ਿ (U+0A3F)
ी (U+0940) | ੀ (U+0A40)
ॅ (U+0945) | ੱ (U+0A71)
ॆ (U+0946) | ੇ (U+0A47)
ॆ (U+0946) | ੋ (U+0A4B)
े (U+0947) | ੇ (U+0A47)
े (U+0947) | ੋ (U+0A4B)
ै (U+0948) | ੈ (U+0A48)
ॖ (U+0956) | ੁ (U+0A41)
ॗ (U+0957) | ੂ (U+0A42)
प्टि (U+092A U+094D U+091F U+093F) | ਇ (U+0A07)
प्टी (U+092A U+094D U+091F U+0940) | ਈ (U+0A08)
प्टे (U+092A U+094D U+091F U+0947) | ਏ (U+0A0F)
प्टॆ (U+092A U+094D U+091F U+0946) | ਏ (U+0A0F)
त्त (U+0924 U+094D U+0924) | ਜ (U+0A1C)
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म (U+092E) | ম (U+09AE)
ि (U+093F) | ি (U+09BF)
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22: Devanagari Cross-script confusables, in Appendix B: Cross-script Confusables, lists them.
7 WholeLabelEvaluationRules(WLE)ThissectionprovidestheWLEsthatarerequiredbyallthelanguagesmentionedinSection
32whenwritteninDevanagariScriptTheruleshavebeendraftedinsuchawaythattheycanbeeasilytranslatedintotheLGRspecification
BelowarethesymbolsusedintheWLErules foreachoftheCategoryasmentionedin
theTable6Codepointrepertoire
C rarr Consonant
M rarr Matra
V rarr Vowel
B rarr Anusvara(Bindu)
D rarr Candrabindu
X rarr Visarga
H rarr HalantVirama
N rarr Nukta
S rarr EyelashReph(C2HC3)whereC2is0931(ऱ-DEVANAGARILETTERRRA)His094D( - DEVANAGARISIGNVIRAMA)C3iseither-092F(य - DEVANAGARILETTERYA)or0939(ह - DEVANAGARILETTERHA)
BelowarethespecificWLErules
1 NmustbeprecededonlybyamemberofC1V1orM1
ThesetC1consistsoftheseconsonants
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
45
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
ThesetV1consistsofthesevowels
a आ (U+0906)(RequiredinSantalilanguage)
b ओ (U+0913)(RequiredinSantalilanguage)
ThesetM1consistsofthesematras
a ा (U+093E)(RequiredinSantalilanguage)
b ो (U+094B)(RequiredinSantalilanguage)
2 H must be preceded by C or CN (see footnote 15)
3 M must be preceded by C or CN (see footnote 16)
4 X must be preceded by either of V, C, N or M
5 B must be preceded by either of V, C, N or M
6 D must be preceded by either of V, C, N or M
7 V can NOT be preceded by H (details in "Case of V preceded by H")
15 where CN is a C followed by an N
16 where CN is a C followed by an N
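The seven context rules above can be sketched as a small checker. This is only an illustrative sketch, not the normative LGR (the XML file accompanying the proposal is): the function and variable names are hypothetical, the category sets are transcribed from Table 6, U+0931 (RRA) is left out because it is admitted only through the sequences of Table 7, and rules 4 to 6 are read as "preceded by V, C, N or M".

```python
# Illustrative sketch of WLE rules 1-7; all names here are hypothetical.
def _span(first, last):
    """Inclusive range of code points as a set."""
    return set(range(first, last + 1))

# Categories transcribed from Table 6 (U+0931 RRA deliberately omitted).
C = (_span(0x0915, 0x0928) | _span(0x092A, 0x0930) | {0x0932, 0x0933}
     | _span(0x0935, 0x0939) | {0x097B, 0x097C, 0x097E, 0x097F})
V = _span(0x0905, 0x090B) | _span(0x090D, 0x0914) | _span(0x0972, 0x0977)
M = ({0x093A, 0x093B, 0x094F, 0x0956, 0x0957}
     | _span(0x093E, 0x0943) | _span(0x0945, 0x094C))
SINGLETONS = {0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H", 0x093C: "N"}

# Sets used by rule 1 (Nukta context).
C1 = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C, 0x0921, 0x0922, 0x092B}
V1 = {0x0906, 0x0913}
M1 = {0x093E, 0x094B}

def category(cp):
    """Return the category letter for a code point, or None if unknown."""
    if cp in C:
        return "C"
    if cp in V:
        return "V"
    if cp in M:
        return "M"
    return SINGLETONS.get(cp)

def wle_violations(label):
    """Return a list of (index, rule) pairs for every violated rule."""
    cps = [ord(ch) for ch in label]
    cats = [category(cp) for cp in cps]
    problems = []
    for i, (cp, cat) in enumerate(zip(cps, cats)):
        prev_cp = cps[i - 1] if i else None
        prev = cats[i - 1] if i else None
        # "C or CN": previous is C, or previous is N that itself follows a C.
        after_c_or_cn = prev == "C" or (
            prev == "N" and i >= 2 and cats[i - 2] == "C")
        if cat is None:
            problems.append((i, "outside repertoire"))
        elif cat == "N" and prev_cp not in (C1 | V1 | M1):
            problems.append((i, "rule 1"))
        elif cat == "H" and not after_c_or_cn:
            problems.append((i, "rule 2"))
        elif cat == "M" and not after_c_or_cn:
            problems.append((i, "rule 3"))
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            problems.append((i, {"X": "rule 4", "B": "rule 5", "D": "rule 6"}[cat]))
        elif cat == "V" and prev == "H":
            problems.append((i, "rule 7"))
    return problems
```

Because every dependent sign needs a permitted predecessor, a label that starts with a Matra, Nukta, Halant, Anusvara, Candrabindu or Visarga is rejected without a separate start-of-label rule.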
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1 As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2 The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins with either a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any valid akshar-ending character. Only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkz.com | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and .np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ. of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K P Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11.0, http://www.unicode.org/versions/Unicode11.0.0/ (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate, Ministry of HRD, Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क (U+0915) | क़ (U+0915 U+093C)
ख (U+0916) | ख़ (U+0916 U+093C)
ग (U+0917) | ग़ (U+0917 U+093C)
च (U+091A) | च़ (U+091A U+093C)
छ (U+091B) | छ़ (U+091B U+093C)
ज (U+091C) | ज़ (U+091C U+093C)
ड (U+0921) | ड़ (U+0921 U+093C)
ढ (U+0922) | ढ़ (U+0922 U+093C)
फ (U+092B) | फ़ (U+092B U+093C)
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
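The "all constituents must have a counterpart" condition above can be illustrated with a short sketch. It is a simplification made for illustration only: it works per code point rather than per akshara, covers only the Devanagari-Gurmukhi pairs of Table 22, and the names `GURMUKHI_CONFUSABLES` and `may_be_cross_script_confusable` are hypothetical.

```python
# Devanagari -> Gurmukhi confusable pairs taken from Table 22.
GURMUKHI_CONFUSABLES = {
    "\u0920": {"\u0A28", "\u0A30"},  # ठ ~ ਨ, ਰ
    "\u0921": {"\u0A21", "\u0A24"},  # ड ~ ਡ, ਤ
    "\u0922": {"\u0A22"},            # ढ ~ ਢ
    "\u0924": {"\u0A1C"},            # त ~ ਜ
    "\u092F": {"\u0A27"},            # य ~ ਧ
}

def may_be_cross_script_confusable(label, table=GURMUKHI_CONFUSABLES):
    """True only if every character of the label has a counterpart.

    A single character without a counterpart gives the label a visual
    distinction, so the whole label is treated as non-confusable.
    """
    return bool(label) and all(ch in table for ch in label)
```

For example, a label made only of ठ and ड passes the check, while adding क (which has no Gurmukhi counterpart in Table 22) makes it fail.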
Devanagari confusable | Other script confusable | From script
ः (U+0903) | ઃ (U+0A83) | Gujarati
ः (U+0903) | ః (U+0C03) | Telugu
ः (U+0903) | ಃ (U+0C83) | Kannada
ः (U+0903) | ഃ (U+0D03) | Malayalam
ः (U+0903) | ඃ (U+0D83) | Sinhala
ः (U+0903) | ঃ (U+0983) (see footnote 17) | Bengali
उ (U+0909) | ও (U+0993) | Bengali
घ (U+0918) | ঘ (U+0998) | Bengali
ठ (U+0920) | ਨ (U+0A28) | Gurmukhi
ठ (U+0920) | ਰ (U+0A30) | Gurmukhi
ड (U+0921) | ਡ (U+0A21) | Gurmukhi
ड (U+0921) | ਤ (U+0A24) | Gurmukhi
ढ (U+0922) | ਢ (U+0A22) | Gurmukhi
त (U+0924) | ਜ (U+0A1C) | Gurmukhi
य (U+092F) | ਧ (U+0A27) | Gurmukhi
◌ॅ (U+0945) | ◌ঁ (U+0981) | Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
5 Repertoire
Section 5.1 provides the section of the [MSR] applicable to the Devanagari script, on which the Devanagari code point repertoire is based. Section 5.2 details the code point repertoire that the Neo-Brahmi Generation Panel [NBGP] proposes to be included in the Devanagari LGR.
5.1 Devanagari section of Maximal Starting Repertoire [MSR] Version 4
Figure 2 Devanagari Code Page from [MSR]
Color convention (see footnote 12):
All characters that are included in the [MSR] - Yellow background
PVALID in IDNA2008 but excluded from the MSR for various reasons - Pinkish background
Not PVALID in IDNA2008 - White background
12 This document needs to be printed in color for this to be read correctly.
5.2 Code Point Repertoire
For each of the code points, language references are given in the last column, titled Reference. For the entire coverage of Devanagari code points, references for Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, together they cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
SrNo | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 | 0901 | ◌ँ | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1 Hindi, Nepali | [0][101][102][103][105][108][110][111][112][113]
2 | 0902 | ◌ं | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
3 | 0903 | ◌ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 Hindi | [0][101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][101][102][108][112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 Nepali | [0][102][103][110][112][113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
53 | 093A | ◌ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
54 | 093B | ◌ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
55 | 093C | ◌़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 Hindi | [0][101][105][108][110][109][111]
56 | 093E | ◌ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
57 | 093F | ◌ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
58 | 0940 | ◌ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
59 | 0941 | ◌ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
60 | 0942 | ◌ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
61 | 0943 | ◌ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
62 | 0945 | ◌ॅ | DEVANAGARI VOWEL SIGN CANDRA E (= candra) | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0][100][101][108]
63 | 0946 | ◌ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
64 | 0947 | ◌े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
65 | 0948 | ◌ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
66 | 0949 | ◌ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][108]
67 | 094A | ◌ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
68 | 094B | ◌ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
69 | 094C | ◌ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
70 | 094D | ◌् | DEVANAGARI SIGN VIRAMA | Halant / Virama | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
71 | 094F | ◌ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
72 | 0956 | ◌ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
73 | 0957 | ◌ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9][100][102][108][112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, to allow inclusion of the "Eyelash Reph" construct (see footnote 13).
SrNo | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7 Sequences
13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by the normal Ra, U+0930), the term "Reph" is used here.
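The conditional inclusion of RRA can be illustrated with a brief check that every U+0931 in a label starts one of the two sequences of Table 7. This is a hedged sketch, not the LGR's own mechanism (which defines the sequences in the repertoire itself); the function name `rra_only_in_sequences` is hypothetical.

```python
RRA = "\u0931"      # DEVANAGARI LETTER RRA
VIRAMA = "\u094D"   # DEVANAGARI SIGN VIRAMA
ALLOWED_FINALS = {"\u092F", "\u0939"}  # YA and HA, per Table 7

def rra_only_in_sequences(label):
    """True if each RRA is followed by VIRAMA and then YA or HA."""
    for i, ch in enumerate(label):
        if ch != RRA:
            continue
        # Slices return "" past the end of the label, which fails both tests.
        if label[i + 1:i + 2] != VIRAMA or label[i + 2:i + 3] not in ALLOWED_FINALS:
            return False
    return True
```

A label with no RRA at all passes trivially; a bare RRA, or RRA + Virama followed by any other consonant, is rejected.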
5.3 Code points not included
The following code points have not been included in the repertoire:
SrNo | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ◌ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of forming their words, known as akshar. The next section gives detailed akshar formation rules as applicable to representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodating additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
Section 7 gives the Whole Label Evaluation (WLE) rules, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first subsection lists the categories of the characters in the form of variables; in the rules, the variable names are used instead of the descriptive names. The second subsection lists four operators, along with their functions, which are assumed while specifying the rules. The following two subsections describe the two major categories of akshar formation, the first of which begins with a vowel and the second with a consonant. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
5.5.2 Operators used
Symbol    Function
|         Alternative
[ ]       Optional
*         Variable Repetition
( )       Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may optionally be followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ (a) | U+0905
Vowel + Anusvara | V[B] | अं (aṁ) | U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ (aṃ) | U+0905 U+0901
Vowel + Visarga | V[X] | अः (aḥ) | U+0905 U+0903
Table 9
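As an illustration, the pattern V[B|D|X] can be written as a regular expression. This is a sketch under assumptions: the vowel class uses every code point categorised as Vowel in Table 6 (which is broader than what Hindi alone needs), and the names `VOWEL_SEQUENCE` and `is_vowel_sequence` are hypothetical.

```python
import re

# Vowel category from Table 6; broader than the Hindi-only subset.
VOWEL = "[\u0905-\u090B\u090D-\u0914\u0972-\u0977]"
# Candrabindu (D), Anusvara (B), Visarga (X).
MODIFIER = "[\u0901\u0902\u0903]"

# V[B|D|X]: one vowel, optionally followed by exactly one modifier.
VOWEL_SEQUENCE = re.compile(f"{VOWEL}{MODIFIER}?")

def is_vowel_sequence(text):
    """True if the whole string is a single vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

The optional group admits at most one of B, D or X, so a Visarga after an Anusvara or Candrabindu is rejected, in line with the restriction stated above.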
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may optionally be followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. The consonant sequence can be extended further after the N, M and H. Each of these is discussed in the following sections.
1 A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क (ka) | U+0915 <single character>
Consonant + Nukta | C[N] | क़ (ḳa) | U+0915 U+093C
Table 10
2 A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि (ki) | U+0915 U+093F
Consonant + Anusvara | C[B] | कं (kaṁ) | U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ (kaṃ) | U+0915 U+0901
Consonant + Visarga | C[X] | कः (kaḥ) | U+0915 U+0903
Consonant + Halant | C[H] | क् (k, Pure Consonant) | U+0915 U+094D
Table 11
2A. A CM sequence can optionally be followed by D, B or X:
(CM)[D|B|X]
Examples:
Sequence Description                 Sequence   Example      Constituting characters
Consonant + Matra + Anusvara         CM[B]      कीं (kīṁ)     U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu      CM[D]      काँ (kāṃ)     U+0915 U+093E U+0901
Consonant + Matra + Visarga          CM[X]      कीः (kīḥ)     U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant [footnote 14]:
3(CH)C
Example:
Sequence Description: Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant
Sequence: CHCHCHC
Example: न्क्र्य (nkrya)
Constituting characters: U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets
3A. The combination may be followed by M, B, D or X.
Examples:
Sequence Description                             Sequence   Example       Constituting characters
Consonant + Halant + Consonant + Matra           CHC[M]     क्की (kkī)     U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara        CHC[B]     क्कं (kkaṁ)    U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu     CHC[D]     क्कँ (kkaṃ)    U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga         CHC[X]     क्कः (kkaḥ)    U+0915 U+094D U+0915 U+0903
Table 14
14 In the case of Sanskrit, it can join up to 5 consonants.
3B. 3(CH)CM may be followed by a B, D or X.
Examples:
Sequence Description                                     Sequence   Example        Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara        CHCM[B]    क्कीं (kkīṁ)    U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu     CHCM[D]    क्कीँ (kkīṃ)    U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga         CHCM[X]    क्कीः (kkīḥ)    U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules when one takes into account the digits, punctuation and special standalone characters like Avagraha.
Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
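The Hindi akshar patterns above (V[B|D|X], C[N], C[M|B|D|X|H], (CM)[D|B|X], 3(CH)C and their subsets) can be sketched as a single regular expression over category symbols. The following is only an illustrative sketch: the function names `category` and `is_hindi_akshar` are hypothetical, and the code point ranges used to assign categories are a simplification assumed for this example rather than the repertoire defined by this LGR.

```python
import re

def category(ch):
    """Map one code point to a category symbol (simplified, assumed ranges)."""
    cp = ord(ch)
    if cp == 0x0901:
        return "D"  # Candrabindu
    if cp == 0x0902:
        return "B"  # Anusvara
    if cp == 0x0903:
        return "X"  # Visarga
    if 0x0905 <= cp <= 0x0914:
        return "V"  # independent vowels
    if 0x0915 <= cp <= 0x0939:
        return "C"  # consonants
    if cp == 0x093C:
        return "N"  # Nukta
    if 0x093E <= cp <= 0x094C:
        return "M"  # dependent vowel signs (Matras)
    if cp == 0x094D:
        return "H"  # Halant
    return "?"      # anything else is outside this sketch

# Either V[B|D|X], or up to three C[N]H groups, then C[N],
# then either a final H or an optional M followed by an optional B, D or X.
AKSHAR = re.compile(r"V[BDX]?|(?:CN?H){0,3}CN?(?:H|M?[BDX]?)")

def is_hindi_akshar(text):
    """Return True if text is exactly one akshar under the patterns above."""
    cats = "".join(category(ch) for ch in text)
    return AKSHAR.fullmatch(cats) is not None
```

The limit of three C[N]H groups mirrors the "up to 4 consonants" statement for Hindi. The WLE rules in Section 7 do not impose that limit, so a label-level check would not use this bound.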
6 Variants
There are no characters or character sequences in Devanagari that can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to a possibility of creation of certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed.
Variant 1       Variant 2
आ U+0906        आ़ U+0906 U+093C
ओ U+0913        ओ़ U+0913 U+093C
ा U+093E        ा़ U+093E U+093C
ो U+094B        ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. E.g. in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the leading U+0906 of the sequence U+0906 U+093C. By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C), where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus, the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
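The effect of the "not followed by a Nukta" condition can be illustrated with a small sketch. The function names `forward_positions` and `apply_forward` are hypothetical and exist only for this illustration; the sketch covers the Variant 1 to Variant 2 direction only and does not model the rest of the LGR.

```python
NUKTA = "\u093C"
# Variant 1 code points from Table 16: AA, O, vowel sign AA, vowel sign O.
VARIANT1 = {"\u0906", "\u0913", "\u093E", "\u094B"}

def forward_positions(label):
    """Indices where the Variant 1 -> Variant 2 mapping is defined,
    i.e. a Variant 1 code point that is not followed by a Nukta."""
    return [i for i, ch in enumerate(label)
            if ch in VARIANT1 and label[i + 1:i + 2] != NUKTA]

def apply_forward(label):
    """Insert a Nukta after every position where the mapping is defined."""
    out = []
    for i, ch in enumerate(label):
        out.append(ch)
        if ch in VARIANT1 and label[i + 1:i + 2] != NUKTA:
            out.append(NUKTA)
    return "".join(out)
```

Applying `apply_forward` to its own output leaves the label unchanged, because every Variant 1 code point is then already followed by a Nukta. This is the regenerative substitution (U+0906 U+093C U+093C) that the context rule is meant to rule out.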
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set              Mapping
Variant Set A    093E 0901 <--> 0949 0902
Variant Set B    0906 0901 <--> 0911 0902
Variant Set C    0906 0902 <--> 0974
Variant Set D    093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:
Source   Glyph   Target       Glyph   Type         Variant Context
0906     आ       0906 093C    आ़      ↔ blocked    not followed-by-N
093E     ा       093E 093C    ा़      ↔ blocked    not followed-by-N

When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the sequences of Variant Sets A, B, C and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise, for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set              Mapping                                           Variant Contextual Rule
Variant Set A    093E 0901 <--> 093E 093C 0901 <--> 0949 0902      -
Variant Set B    0906 0901 <--> 0906 093C 0901 <--> 0911 0902      -
Variant Set C    0906 0902 <--> 0906 093C 0902 <--> 0974           when(followed-by-V-C-or-end)
Variant Set D    093B <--> 093E 0902 <--> 093E 093C 0902           when(followed-by-V-C-or-end)
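As a rough illustration of what symmetry and transitivity mean for the four extended sets, the sketch below builds a mapping in which every member of a set maps to every other member, and then checks both properties. The names `VARIANT_SETS`, `build_variant_map`, `is_symmetric` and `is_transitive` are hypothetical; context rules are deliberately left out, so this is not a substitute for an LGR tool.

```python
# Each member is a tuple of code points; sets follow the final table above.
VARIANT_SETS = {
    "A": [(0x093E, 0x0901), (0x093E, 0x093C, 0x0901), (0x0949, 0x0902)],
    "B": [(0x0906, 0x0901), (0x0906, 0x093C, 0x0901), (0x0911, 0x0902)],
    "C": [(0x0906, 0x0902), (0x0906, 0x093C, 0x0902), (0x0974,)],
    "D": [(0x093B,), (0x093E, 0x0902), (0x093E, 0x093C, 0x0902)],
}

def build_variant_map(sets):
    """Map every member to the set of all other members of its variant set."""
    mapping = {}
    for members in sets.values():
        for source in members:
            mapping[source] = {m for m in members if m != source}
    return mapping

def is_symmetric(mapping):
    """If a maps to b, then b must map to a."""
    return all(a in mapping[b] for a in mapping for b in mapping[a])

def is_transitive(mapping):
    """If a maps to b and b maps to c (c != a), then a must map to c."""
    return all(c == a or c in mapping[a]
               for a in mapping for b in mapping[a] for c in mapping[b])
```

For example, `build_variant_map(VARIANT_SETS)[(0x0906, 0x093C, 0x0902)]` contains both `(0x0906, 0x0902)` and `(0x0974,)`, which reflects the Nukta sequence added to Set C.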
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1       Variant 2
ॳ U+0973        अं U+0905 U+0902
ऺ U+093A        ं U+0902
ॴ U+0974        आं U+0906 U+0902
ऻ U+093B        ां U+093E U+0902
ऎ U+090E        ऐ U+0910
ॆ U+0946        े U+0947
ॵ U+0975        औ U+0914
ॏ U+094F        ौ U+094C
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point contexts, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion; not proposed as variants)
Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases, the presence or absence of the final Halant is clearly visible and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of the variants are as follows:
Variant 1             Variant 2
ँ U+0901              ॅं U+0945 U+0902
ाँ U+093E U+0901      ॉं U+0949 U+0902
अँ U+0905 U+0901      ॲं U+0972 U+0902
एँ U+090F U+0901      ऍं U+090D U+0902
आँ U+0906 U+0901      ऑं U+0911 U+0902
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, like a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, U+0945 U+0902 and U+0949 U+0902 should each be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
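The context condition on the Candrabindu mapping can be sketched as follows. The helper names `is_consonant`, `context_ok` and `candra_variant` are hypothetical, and the consonant range U+0915 to U+0939 is a simplifying assumption made for this example only.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"
CANDRA_E_ANUSVARA = "\u0945\u0902"  # vowel sign CANDRA E followed by Anusvara

def is_consonant(ch):
    # Simplified consonant range, assumed for illustration only.
    return 0x0915 <= ord(ch) <= 0x0939

def context_ok(label, i):
    """True if label[i] is preceded by C, or by C followed by N."""
    if i >= 1 and is_consonant(label[i - 1]):
        return True
    return i >= 2 and label[i - 1] == NUKTA and is_consonant(label[i - 2])

def candra_variant(label):
    """Replace each Candrabindu by U+0945 U+0902 where the context allows it."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and context_ok(label, i):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu after a Vowel or a Matra is left untouched by this sketch, since a Matra such as U+0945 could not validly appear in that position.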
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari                              Gurmukhi
ं U+0902                                ਂ U+0A02
इ U+0907                                ਙ U+0A19
उ U+0909                                ਤ U+0A24
ग U+0917                                ਗ U+0A17
घ U+0918                                ਬ U+0A2C
ट U+091F                                ਟ U+0A1F
ठ U+0920                                ਠ U+0A20
ढ U+0922                                ਫ U+0A2B
प U+092A                                ਧ U+0A27
भ U+092D                                ਮ U+0A2E
म U+092E                                ਸ U+0A38
व U+0935                                ਕ U+0A15
ह U+0939                                ਵ U+0A35
ऺ U+093A                                ਂ U+0A02
़ U+093C                                ਼ U+0A3C
ि U+093F                                ਿ U+0A3F
ी U+0940                                ੀ U+0A40
ॅ U+0945                                ੱ U+0A71
ॆ U+0946                                ੇ U+0A47
ॆ U+0946                                ੋ U+0A4B
े U+0947                                ੇ U+0A47
े U+0947                                ੋ U+0A4B
ै U+0948                                ੈ U+0A48
ॖ U+0956                                ੁ U+0A41
ॗ U+0957                                ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F        ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940        ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947        ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946        ਏ U+0A0F
त्त U+0924 U+094D U+0924                ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari      Bengali
म U+092E        ম U+09AE
ि U+093F        ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
The set V1 consists of these vowels:
a आ (U+0906) (Required in Santali language)
b ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a ा (U+093E) (Required in Santali language)
b ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN [footnote 15]
3. M must be preceded by C or CN [footnote 16]
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
15 where CN is a C followed by an N
16 where CN is a C followed by an N
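The seven rules above can be sketched as a simple left-to-right check that looks at the one or two code points preceding each position. This is only an illustration: the names `category` and `violations` are hypothetical, the category ranges are simplified assumptions (for instance, the Kashmiri-specific vowels and vowel signs are not covered), and the Eyelash Reph sequences and the variant-only rules are not modeled.

```python
# Code points after which a Nukta is allowed (sets C1, V1 and M1 of rule 1).
C1 = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C, 0x0921, 0x0922, 0x092B}
V1 = {0x0906, 0x0913}
M1 = {0x093E, 0x094B}
NUKTA_BASES = C1 | V1 | M1

def category(ch):
    """Map one code point to a WLE symbol (simplified, assumed ranges)."""
    cp = ord(ch)
    if cp == 0x0901:
        return "D"
    if cp == 0x0902:
        return "B"
    if cp == 0x0903:
        return "X"
    if 0x0905 <= cp <= 0x0914:
        return "V"
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if cp == 0x093C:
        return "N"
    if 0x093E <= cp <= 0x094C:
        return "M"
    if cp == 0x094D:
        return "H"
    return "?"

def violations(label):
    """Return the numbers of the WLE rules violated, in label order."""
    cats = [category(ch) for ch in label]
    found = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i >= 1 else ""
        prev2 = cats[i - 2] if i >= 2 else ""
        after_c_or_cn = prev == "C" or (prev == "N" and prev2 == "C")
        if cat == "N":
            if i == 0 or ord(label[i - 1]) not in NUKTA_BASES:
                found.append(1)
        elif cat == "H" and not after_c_or_cn:
            found.append(2)
        elif cat == "M" and not after_c_or_cn:
            found.append(3)
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            found.append({"X": 4, "B": 5, "D": 6}[cat])
        elif cat == "V" and prev == "H":
            found.append(7)
    return found
```

With these assumptions, the multi-word example U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 is flagged under rule 7 only, which matches the discussion in "Case of V preceded by H".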
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of permissible sequences in Table 7 (Sequences), it gets permitted only with the specific sequences.
2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rule is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire, MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb. 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov. 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC 7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec. 2017)
[RFC 8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", RFC 8228, August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec. 2017)
[gTLD]genericTopLevelDomain
[ISCII]IndianScriptCodeforInformationInterchangehttpscdacinindexaspxid=mlc_gist_iscii(Accessedon2ndFeb2018)
[GIST]GraphicsIntelligencebasedScriptTechnologieshttpscdacinindexaspxid=gist(Accessedon2ndFeb2018)
[C-DAC]CentreforDevelopmentofAdvancedComputinghttpscdacin(Accessedon2ndFeb2018)
[0]TheUnicodeStandard11httpwwwunicodeorgversionsUnicode110(Accessedon12thDec2017)
[8]TheUnicodeStandard50httpwwwunicodeorgversionsUnicode500(Accessedon12thDec2017)
[9]TheUnicodeStandard51httpwwwunicodeorgversionsUnicode510(Accessedon12thDec2017)
[11]TheUnicodeStandard60httpwwwunicodeorgversionsUnicode600(Accessedon12thDec2017)
[100]DevanāgarīVIPTeamldquoVariantIssuesReportrdquoICANN3rdOct2011httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf(Accessedon10thOct2017)
[101]OmniglotHindihttpswwwomniglotcomwritinghindihtm(Accessedon10thOct2017)
[102]OmniglotMarathihttpswwwomniglotcomwritingmarathihtm(Accessedon10thOct2017)
[103]OmniglotSanskrithttpswwwomniglotcomwritingsanskrithtm(Accessedon10thOct2017)
[104]OmniglotSindhihttpswwwomniglotcomwritingsindhihtm(Accessedon10thOct2017)
[105]OmniglotKashmirihttpswwwomniglotcomwritingkashmirihtm(Accessedon10thOct2017)
[106]Unicode1000SouthandCentralAsia-I-OfficialScriptsofIndiardquoPage456(R5andR5a)httpwwwunicodeorgversionsUnicode1000ch12pdf(Accessedon13thNov2017)
[107]UnicodeIndicGroupDevanagariEyelashRahttpunicodeorg~emulleriwgp8utcdochtml(Accessedon13thNov2017)
[108]MKRainaHowtoreadandwriteKashmiriinDevanagarihttpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf(Accessedon12thDec2017)
[109]CentralHindiDirectorate-MinistryofHRD-GovtofIndiaDevanāgarīAlphabetanditsRomanizationhttphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml(Accessedon12thDec2017
[110]OmniglotBodohttpswwwomniglotcomwritingbodohtm(Accessedon12thDec2017)
[111]OmniglotMaithilihttpswwwomniglotcomwritingmaithilihtm(Accessedon12thDec2017)
[112]OmniglotKonkanihttpswwwomniglotcomwritingkonkanihtm(Accessedon20thMay2018)
[113]OmniglotNepalihttpswwwomniglotcomwritingnepalihtm(Accessedon20thMay2018)
[114] NBGP Public comment feedback for Devanagari Gujarati Gurmukhi Script LGRProposalshttpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw(Accessedon18thFeb2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
11 KachruYamuna1966AnIntroductiontoHindiSyntaxUrbanaUniversityof
IllinoisDepartmentofLinguistics
12 KalyanKaleandAnjaliSoman1986LearningMarathiShriVishakhaPrakashan
Pune
13 McGregorRS(1977)OutlineofHindiGrammar2ndedDelhiOxfordUniversity
Press
14 McGregorRS1972OutlineofHindiGrammarwithExercisesDelhiOxford
UniversityPress
15 McGregorRS1974HindiLiteratureoftheNineteenthandEarlyTwentieth
CenturiesWiesbadenHarrassowitz
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1    Confusable 2
क U+0915        क़ U+0915 U+093C
ख U+0916        ख़ U+0916 U+093C
ग U+0917        ग़ U+0917 U+093C
च U+091A        च़ U+091A U+093C
छ U+091B        छ़ U+091B U+093C
ज U+091C        ज़ U+091C U+093C
ड U+0921        ड़ U+0921 U+093C
ढ U+0922        ढ़ U+0922 U+093C
फ U+092B        फ़ U+092B U+093C
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in a given context (e.g. in a browser bar as part of a domain name, or as a single word with no surrounding text from the same script to distinguish it), they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
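The all-or-nothing condition above can be illustrated with a short sketch. It is only an illustration: the dictionary name, its shape and the function name are assumptions, the data is a small Devanagari-to-Gurmukhi subset of Table 22, and it works per code point rather than per akshara.

```python
# Illustrative subset of Table 22, restricted to Devanagari -> Gurmukhi rows.
# The dictionary name and shape are assumptions made for this sketch.
DEVA_TO_GURU_CONFUSABLES = {
    "\u0920": {"\u0A28", "\u0A30"},  # ठ ~ ਨ, ਰ
    "\u0921": {"\u0A21", "\u0A24"},  # ड ~ ਡ, ਤ
    "\u0922": {"\u0A22"},            # ढ ~ ਢ
    "\u0924": {"\u0A1C"},            # त ~ ਜ
    "\u092F": {"\u0A27"},            # य ~ ਧ
}


def may_have_cross_script_confusable(label: str) -> bool:
    """Return True only if every code point of the label has a counterpart.

    A single code point without a counterpart makes the whole label
    visually distinct, so the function returns False in that case.
    """
    return bool(label) and all(ch in DEVA_TO_GURU_CONFUSABLES for ch in label)
```

For instance, a label made only of ठ and ड passes the check, while adding a क (which has no row in the subset) makes it fail.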
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 | Bengali (see footnote 17)
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
5.2 Code Point Repertoire
For each of the code points, language references are given in the last column, titled Reference. To cover the entire set of Devanagari code points, references for Hindi, Marathi, Sanskrit, Sindhi and Kashmiri have been given. Though only five representative languages have been chosen for referencing, together they cover all the code points required for all the languages that the NBGP has considered, as given in Section 3.2.
Sr. No. | Unicode Code Point | Glyph | Character Name | Category | Example languages using the code point (not exhaustive list) | Language with lowest EGIDS scale using the code point | Reference
1 | 0901 | ँ | DEVANAGARI SIGN CANDRABINDU | Candrabindu | Bodo, Hindi, Kashmiri, Konkani, Maithili, Marathi, Nepali, Santali and Sanskrit | 1 Hindi, Nepali | [0][101][102][103][105][108][110][111][112][113]
2 | 0902 | ं | DEVANAGARI SIGN ANUSVARA | Anusvara (Bindu) | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
3 | 0903 | ः | DEVANAGARI SIGN VISARGA | Visarga | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
4 | 0905 | अ | DEVANAGARI LETTER A | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
5 | 0906 | आ | DEVANAGARI LETTER AA | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
6 | 0907 | इ | DEVANAGARI LETTER I | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
7 | 0908 | ई | DEVANAGARI LETTER II | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
8 | 0909 | उ | DEVANAGARI LETTER U | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
9 | 090A | ऊ | DEVANAGARI LETTER UU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
10 | 090B | ऋ | DEVANAGARI LETTER VOCALIC R | Vowel | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
11 | 090D | ऍ | DEVANAGARI LETTER CANDRA E | Vowel | Hindi | 1 Hindi | [0][101]
12 | 090E | ऎ | DEVANAGARI LETTER SHORT E | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
13 | 090F | ए | DEVANAGARI LETTER E | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
14 | 0910 | ऐ | DEVANAGARI LETTER AI | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
15 | 0911 | ऑ | DEVANAGARI LETTER CANDRA O | Vowel | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][101][102][108][112]
16 | 0912 | ऒ | DEVANAGARI LETTER SHORT O | Vowel | Kashmiri | 4 Kashmiri | [0][105][108]
17 | 0913 | ओ | DEVANAGARI LETTER O | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
18 | 0914 | औ | DEVANAGARI LETTER AU | Vowel | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
19 | 0915 | क | DEVANAGARI LETTER KA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
20 | 0916 | ख | DEVANAGARI LETTER KHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
21 | 0917 | ग | DEVANAGARI LETTER GA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
22 | 0918 | घ | DEVANAGARI LETTER GHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
23 | 0919 | ङ | DEVANAGARI LETTER NGA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
24 | 091A | च | DEVANAGARI LETTER CA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
25 | 091B | छ | DEVANAGARI LETTER CHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
26 | 091C | ज | DEVANAGARI LETTER JA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
27 | 091D | झ | DEVANAGARI LETTER JHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
28 | 091E | ञ | DEVANAGARI LETTER NYA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
29 | 091F | ट | DEVANAGARI LETTER TTA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
30 | 0920 | ठ | DEVANAGARI LETTER TTHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
31 | 0921 | ड | DEVANAGARI LETTER DDA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
32 | 0922 | ढ | DEVANAGARI LETTER DDHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
33 | 0923 | ण | DEVANAGARI LETTER NNA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
34 | 0924 | त | DEVANAGARI LETTER TA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
35 | 0925 | थ | DEVANAGARI LETTER THA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
36 | 0926 | द | DEVANAGARI LETTER DA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
37 | 0927 | ध | DEVANAGARI LETTER DHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
38 | 0928 | न | DEVANAGARI LETTER NA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
39 | 092A | प | DEVANAGARI LETTER PA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
40 | 092B | फ | DEVANAGARI LETTER PHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
41 | 092C | ब | DEVANAGARI LETTER BA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
42 | 092D | भ | DEVANAGARI LETTER BHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
43 | 092E | म | DEVANAGARI LETTER MA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
44 | 092F | य | DEVANAGARI LETTER YA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
45 | 0930 | र | DEVANAGARI LETTER RA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
46 | 0932 | ल | DEVANAGARI LETTER LA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
47 | 0933 | ळ | DEVANAGARI LETTER LLA | Consonant | Bodo, Konkani, Marathi, Nepali, Sanskrit | 1 Nepali | [0][102][103][110][112][113]
48 | 0935 | व | DEVANAGARI LETTER VA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
49 | 0936 | श | DEVANAGARI LETTER SHA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
50 | 0937 | ष | DEVANAGARI LETTER SSA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][113]
51 | 0938 | स | DEVANAGARI LETTER SA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
52 | 0939 | ह | DEVANAGARI LETTER HA | Consonant | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][104][105][108][113]
53 | 093A | ऺ | DEVANAGARI VOWEL SIGN OE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
54 | 093B | ऻ | DEVANAGARI VOWEL SIGN OOE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
55 | 093C | ़ | DEVANAGARI SIGN NUKTA | Nukta | Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi | 1 Hindi | [0][101][105][108][110][109][111]
56 | 093E | ा | DEVANAGARI VOWEL SIGN AA | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
57 | 093F | ि | DEVANAGARI VOWEL SIGN I | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
58 | 0940 | ी | DEVANAGARI VOWEL SIGN II | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
59 | 0941 | ु | DEVANAGARI VOWEL SIGN U | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
60 | 0942 | ू | DEVANAGARI VOWEL SIGN UU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
61 | 0943 | ृ | DEVANAGARI VOWEL SIGN VOCALIC R | Matra | Hindi, Marathi, Sanskrit | 1 Hindi | [0][101][102][103]
62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E (= candra) | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0][100][101][108]
63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant / Virama | Most of the languages given in Section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9][100][102][108][112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of DEVANAGARI LETTER RRA in the repertoire, so that the “Eyelash Reph” construct (see footnote 13) can be included.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7 Sequences
13 Unicode uses the term “Eyelash Ra” instead. Since the construct formed by this sequence is a special form of Reph (which is otherwise formed by the normal Ra, U+0930), the term “Reph” is used here.
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per the conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing the Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing the Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per the conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
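As a rough cross-check of Table 6 against the exclusions above, the repertoire can be written as a handful of inclusive ranges. This is a minimal sketch: the names REPERTOIRE_RANGES, EXCLUDED and in_repertoire are assumptions for illustration, and the ranges are transcribed by hand from Table 6 rather than taken from the accompanying XML file.

```python
# Inclusive code point ranges transcribed from Table 6 (assumed transcription).
REPERTOIRE_RANGES = [
    (0x0901, 0x0903), (0x0905, 0x090B), (0x090D, 0x0928),
    (0x092A, 0x0930), (0x0932, 0x0933), (0x0935, 0x0939),
    (0x093A, 0x093C), (0x093E, 0x0943), (0x0945, 0x094D),
    (0x094F, 0x094F), (0x0956, 0x0957), (0x0972, 0x0977),
    (0x097B, 0x097C), (0x097E, 0x097F),
]

# Expand the ranges into a set of individual code points.
REPERTOIRE = {cp for lo, hi in REPERTOIRE_RANGES for cp in range(lo, hi + 1)}

# Code points listed in Section 5.3 as not included.
EXCLUDED = [0x0904, 0x090C, 0x0929, 0x0934, 0x0944, 0x0979, 0x097A]


def in_repertoire(ch: str) -> bool:
    """True if the single character is one of the individual code points."""
    return ord(ch) in REPERTOIRE
```

Note that U+0931 is deliberately absent from the set: it is admitted only through the two sequences of Table 7, which a fuller implementation would have to model separately.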
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of forming their words, known as akshar formation. The next section gives detailed akshar formation rules as applicable to the Hindi language when written in the Devanagari script. These rules need slight additions for other languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required to accommodate additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first subsection lists the categories of characters in the form of variables; the rules use these variable names instead of the descriptive names. The second subsection lists four operators, along with their functions, which are assumed while specifying the rules. The following two subsections describe the two major categories of akshar formation: the first begins with a vowel and the second with a consonant. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
5.5.2 Operators used
Symbol   Function
|        Alternative
[ ]      Optional
*        Variable Repetition
( )      Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may optionally be followed by an Anusvara (B), a Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ a | U+0905
Vowel + Anusvara | V[B] | अं aṁ | अ ं U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ aṃ | अ ँ U+0905 U+0901
Vowel + Visarga | V[X] | अः aḥ | अ ः U+0905 U+0903
Table 9
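The vowel sequence V[B|D|X] maps directly onto a regular expression. The following is a minimal sketch, not part of the proposal: the choice of vowels (those Table 6 lists for Hindi or for most languages) and the names VOWEL_SEQUENCE and is_vowel_sequence are assumptions made for illustration.

```python
import re

# V: independent vowels used for Hindi (assumed subset of Table 6),
# followed by at most one of D (U+0901), B (U+0902) or X (U+0903).
VOWEL_SEQUENCE = re.compile(
    r"[\u0905-\u090B\u090D\u090F-\u0911\u0913\u0914][\u0901-\u0903]?"
)


def is_vowel_sequence(text: str) -> bool:
    """True if the whole string is a single V[B|D|X] sequence."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Because the trailing class is optional and matches at most once, a Visarga after an Anusvara or Candrabindu is rejected, in line with the restriction stated above.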
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may optionally be followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. The consonant sequence can be extended further after N, M and H. Each of these cases is discussed in the following sections.
1. A single consonant (C)
(Wherever pertinent, the consonant shall be treated as coterminous with the consonant along with the Nukta sign.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क ka | U+0915 <single character>
Consonant + Nukta | C[N] | क़ ḳa | क ़ U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign (Matra) [M], Anusvara [B], Candrabindu [D], Visarga [X] or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि ki | क ि U+0915 U+093F
Consonant + Anusvara | C[B] | कं kaṁ | क ं U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ kaṃ | क ँ U+0915 U+0901
Consonant + Visarga | C[X] | कः kaḥ | क ः U+0915 U+0903
Consonant + Halant | C[H] | क् k (pure consonant) | क ् U+0915 U+094D
Table 11
2A. A CM sequence can optionally be followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं kīṁ | क ी ं U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ kāṃ | क ा ँ U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः kīḥ | क ी ः U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14):
3*(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य nkrya | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
14 In the case of Sanskrit, it can join up to 5 consonants.
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets:
3A. The combination may be followed by M, B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की kkī | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं kkaṁ | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ kkaṃ | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः kkaḥ | U+0915 U+094D U+0915 U+0903
Table 14
3B. 3*(CH)CM may be followed by a B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं kkīṁ | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ kkīṃ | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः kkīḥ | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules when one takes into account digits, punctuation and special standalone characters like the Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
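The consonant-sequence rules of Section 5.5.4 can be folded into one pattern. The sketch below is an illustration under assumptions, not the LGR's WLE rules: the character classes cover only the Hindi consonants and matras of Table 6, each consonant may carry an optional Nukta, and the names CONSONANT_AKSHAR and is_consonant_akshar are hypothetical.

```python
import re

C = "[\u0915-\u0928\u092A-\u0930\u0932\u0935-\u0939]"  # Hindi consonants
N = "\u093C"                                            # Nukta
M = "[\u093E-\u0943\u0945\u0947-\u0949\u094B\u094C]"    # Hindi matras
BDX = "[\u0901-\u0903]"                                 # D, B, X
H = "\u094D"                                            # Halant

CN = f"{C}{N}?"  # a consonant, optionally carrying a Nukta

# Up to three "consonant + Halant" links, a final consonant, and then
# either a final Halant or an optional Matra plus an optional B/D/X.
CONSONANT_AKSHAR = re.compile(f"(?:{CN}{H}){{0,3}}{CN}(?:{H}|{M}?{BDX}?)")


def is_consonant_akshar(text: str) -> bool:
    """True if the whole string is one consonant sequence per Section 5.5.4."""
    return CONSONANT_AKSHAR.fullmatch(text) is not None
```

The `{0,3}` bound reflects the "up to 4 consonants" statement for Hindi; as the text notes, the WLE rules in Section 7 do not impose that limit, so a validator built for the LGR itself would drop the bound.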
6 Variants
There are no characters/character sequences in Devanagari which can be created using the characters permitted as per the [MSR] and which look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from character formations as normally perceived by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and hence are being proposed as variants. A brief description of these variants follows, with the variants themselves listed in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in the Section 6.1) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar to a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). In theory this implies a regenerative tendency: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating the invalid akshar combination U+0906 U+093C U+093C, where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an “effective null variant”, where the U+093C in one sequence maps to “nothing” in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
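The effect of the "not followed by Nukta" condition can be seen in a small sketch. It is an illustration only: the function names are hypothetical, and the code models just the Variant 1 to Variant 2 direction of Table 16, not a full LGR variant engine.

```python
NUKTA = "\u093C"
# Variant 1 code points from Table 16 that may take a Nukta variant.
NUKTA_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variant_positions(label: str) -> list[int]:
    """Indices where the Variant 1 -> Variant 2 mapping is allowed to apply."""
    return [
        i for i, ch in enumerate(label)
        if ch in NUKTA_BASES and label[i + 1:i + 2] != NUKTA
    ]


def apply_nukta_variants(label: str) -> str:
    """Insert a Nukta after every base that is not already followed by one."""
    out = []
    for i, ch in enumerate(label):
        out.append(ch)
        if ch in NUKTA_BASES and label[i + 1:i + 2] != NUKTA:
            out.append(NUKTA)
    return "".join(out)
```

Applying the mapping twice gives the same result as applying it once, which is exactly what the context rule is meant to guarantee: the regenerated आ is already followed by a Nukta, so no U+093C U+093C sequence is ever produced.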
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When the variant from the overlapping sets is substituted for 0906 or 093E, respectively, into the left-hand side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence or absence of a Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A
0906 093C 0901 is added to Set B
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D: 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
Additional Notes:
1 Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 technically overlaps with the four sets above. However, because of the variant context rule “when(follows-only-C-or-N)” (see Section 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2 Another variant set overlapping with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
ऺ U+093A | ं U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
ॆ U+0946 | े U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point contexts, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of a Halant at the end of a word is that of a vowel killer, many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases, the presence or absence of the final Halant is clearly visible and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1 | Variant 2
ँ U+0901 | ॅं U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of a preceding independent character such as a Consonant or a Vowel. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that this sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, Vowel, Nukta or Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a Consonant or a Consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the character preceding ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
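The context condition in the rule above amounts to a look-behind of one or two code points. The sketch below is illustrative only: is_consonant transcribes the Consonant rows of Table 6, and the function names are assumptions rather than identifiers from the LGR XML.

```python
NUKTA = "\u093C"
CANDRABINDU = "\u0901"


def is_consonant(ch: str) -> bool:
    """Consonant rows of Table 6, transcribed as code point ranges."""
    cp = ord(ch)
    return (
        0x0915 <= cp <= 0x0928
        or 0x092A <= cp <= 0x0930
        or cp in (0x0932, 0x0933)
        or 0x0935 <= cp <= 0x0939
        or cp in (0x097B, 0x097C, 0x097E, 0x097F)
    )


def candrabindu_variant_applies(label: str, i: int) -> bool:
    """True if the U+0901 at index i may map to U+0945 U+0902.

    The mapping exists only when the Candrabindu is preceded by a
    Consonant, or by a Consonant followed by a Nukta.
    """
    if label[i] != CANDRABINDU:
        return False
    j = i - 1
    if j >= 0 and label[j] == NUKTA:
        j -= 1  # skip the Nukta and look at what carries it
    return j >= 0 and is_consonant(label[j])
```

A Candrabindu after a vowel or after a Matra therefore yields no U+0945 U+0902 variant from this pair; those positions are covered, where applicable, by the other rows of Table 18.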
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing either of these variants is chosen first, the other equivalent variant label should be blocked.
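A first-come, first-served blocking scheme of this kind can be sketched by reducing every label to a common key. This is a simplified illustration, not how an LGR tool necessarily works: it uses only the context-free pairs of Table 17, leaves out the U+093A / U+0902 pair because it overlaps with the sequence mappings, and the names VARIANT_PAIRS, blocking_key and Registry are hypothetical.

```python
# (Variant 1, Variant 2) pairs taken from Table 17; the U+093A / U+0902
# pair is omitted in this sketch because it overlaps with the sequences.
VARIANT_PAIRS = [
    ("\u0973", "\u0905\u0902"),
    ("\u0974", "\u0906\u0902"),
    ("\u093B", "\u093E\u0902"),
    ("\u090E", "\u0910"),
    ("\u0946", "\u0947"),
    ("\u0975", "\u0914"),
    ("\u094F", "\u094C"),
]


def blocking_key(label: str) -> str:
    """Rewrite every Variant 2 as its Variant 1 to get a comparison key."""
    for variant1, variant2 in VARIANT_PAIRS:
        label = label.replace(variant2, variant1)
    return label


class Registry:
    """Toy registry: the first label of a variant set wins, the rest are blocked."""

    def __init__(self) -> None:
        self._allocated: dict[str, str] = {}

    def request(self, label: str) -> bool:
        key = blocking_key(label)
        if key in self._allocated:
            return False  # a variant label was allocated earlier: blocked
        self._allocated[key] = label
        return True
```

Because neither member of a pair is preferred, the direction of the rewrite in blocking_key is arbitrary; only the fact that both members map to the same key matters.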
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where a label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word ‘most’ is used here as it is not practical to cover all the possible “Consonant + Halant + Consonant + …” cases. However, for Devanagari, all cases of “Consonant + Halant + Consonant” combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. Table 19 lists the cases proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari                                Gurmukhi
U+0902                                    U+0A02
इ U+0907                                  ਙ U+0A19
उ U+0909                                  ਤ U+0A24
ग U+0917                                  ਗ U+0A17
घ U+0918                                  ਬ U+0A2C
ट U+091F                                  ਟ U+0A1F
ठ U+0920                                  ਠ U+0A20
ढ U+0922                                  ਫ U+0A2B
प U+092A                                  ਧ U+0A27
भ U+092D                                  ਮ U+0A2E
म U+092E                                  ਸ U+0A38
व U+0935                                  ਕ U+0A15
ह U+0939                                  ਵ U+0A35
U+093A                                    U+0A02
U+093C                                    U+0A3C
ि U+093F                                  ਿ U+0A3F
ी U+0940                                  ੀ U+0A40
U+0945                                    U+0A71
U+0946                                    U+0A47
U+0946                                    U+0A4B
U+0947                                    U+0A47
U+0947                                    U+0A4B
U+0948                                    U+0A48
U+0956                                    U+0A41
U+0957                                    U+0A42
U+092A U+094D U+091F U+093F               ਇ U+0A07
U+092A U+094D U+091F U+0940               ਈ U+0A08
U+092A U+094D U+091F U+0947               ਏ U+0A0F
U+092A U+094D U+091F U+0946               ਏ U+0A0F
U+0924 U+094D U+0924                      ਜ U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari                                Bengali
म U+092E                                  ম U+09AE
ि U+093F                                  ি U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a set of possible cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is U+0931 (ऱ - DEVANAGARI LETTER RRA), H is U+094D (DEVANAGARI SIGN VIRAMA) and C3 is either U+092F (य - DEVANAGARI LETTER YA) or U+0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
The set V1 consists of these vowels:
a आ (U+0906) (required in the Santali language)
b ओ (U+0913) (required in the Santali language)
The set M1 consists of these matras:
a ा (U+093E) (required in the Santali language)
b ो (U+094B) (required in the Santali language)
2. H must be preceded by C or CN (where CN is a C followed by an N)
3. M must be preceded by C or CN (where CN is a C followed by an N)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
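The seven rules above can be checked mechanically. The following is a minimal sketch rather than the normative XML rule set: the function names are hypothetical, the category sets are transcribed from Table 6 under the assumption that its first rows (not reproduced in this part of the document) cover U+0901 to U+0908, and the Eyelash Reph sequences and variant-specific rules are left out.

```python
def _cps(*ranges):
    """Expand inclusive (first, last) code point ranges into a set."""
    return {cp for first, last in ranges for cp in range(first, last + 1)}


# Category membership, assumed from Table 6 (U+0931 is omitted on purpose).
CATEGORIES = {
    "C": _cps((0x0915, 0x0928), (0x092A, 0x0930), (0x0932, 0x0933),
              (0x0935, 0x0939), (0x097B, 0x097C), (0x097E, 0x097F)),
    "V": _cps((0x0905, 0x090B), (0x090D, 0x0914), (0x0972, 0x0977)),
    "M": _cps((0x093A, 0x093B), (0x093E, 0x0943), (0x0945, 0x094C),
              (0x094F, 0x094F), (0x0956, 0x0957)),
    "B": {0x0902}, "D": {0x0901}, "X": {0x0903},
    "H": {0x094D}, "N": {0x093C},
}
# Union of the sets C1, V1 and M1 listed under rule 1.
NUKTA_BASES = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C, 0x0921,
               0x0922, 0x092B, 0x0906, 0x0913, 0x093E, 0x094B}


def category(ch):
    """Return the one-letter category of a character, or None."""
    for name, members in CATEGORIES.items():
        if ord(ch) in members:
            return name
    return None


def wle_violations(label):
    """Return (index, reason) pairs for every rule broken by the label."""
    cats = [category(ch) for ch in label]
    violations = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i >= 1 else None
        prev2 = cats[i - 2] if i >= 2 else None
        after_c_or_cn = prev == "C" or (prev == "N" and prev2 == "C")
        if cat is None:
            violations.append((i, "not in repertoire"))
        elif cat == "N" and (i == 0 or ord(label[i - 1]) not in NUKTA_BASES):
            violations.append((i, "rule 1"))
        elif cat == "H" and not after_c_or_cn:
            violations.append((i, "rule 2"))
        elif cat == "M" and not after_c_or_cn:
            violations.append((i, "rule 3"))
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            rule = {"X": "rule 4", "B": "rule 5", "D": "rule 6"}[cat]
            violations.append((i, rule))
        elif cat == "V" and prev == "H":
            violations.append((i, "rule 7"))
    return violations
```

For instance, the label U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 discussed under "Case of V preceded by H" is reported as breaking rule 7 at its fourth code point, since the vowel U+0905 follows a Halant.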
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any character that may end a valid akshar. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages/scripts:
Name                              Language/Script Expertise
Ajit Kumar                        Awadhi/Braj Language
Amar Tumyahang                    Limbu Language
Amrit Yonjan                      Tamang Language
Aprana Kulkarni                   Hindi, Marathi
Basil Baa                         Sadri Language
Basil Kiro                        Kharia Language
Biswa Limbu                       Limbu Language
Devdass Manandhar                 Newar
Devendra Kumar Devesh             Bhojpuri Language
Dinbandhu Mahto                   Panchpargania Language
Dipika Sangma Narzary             Bodo Language
Dr. K. P. Lekhwani                Sindhi
Dr. Birendra Kumar Soy            Mundari Language
Dr. Dinesh Kumar Shrivastav       Magahi Language
Dr. Harvinder Kaur                Gurmukhi Script
Dr. Laxmi Prasad Khatiwada        Nepali Language
Harihar Vaishnav                  Halbi
Indra Kumar Tamang                Tamang Language
Jagannath Singh                   Panchpargania Language
Narendra Kumar Negi               Kinnauri Language
Prateek Harshwal                  Wagdi and Dhundhari Language
Rayem Olem Dungdung               Sadri Language
Tej Man Angdembe                  Limbu Language
Urmila Harshwal                   Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4 Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11, httpwwwunicodeorgversionsUnicode110 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1. Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition, in 2 Volumes. Hutchison, London, 1968.
10.2 DEVANĀGARĪ
1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar, 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10 (2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1. IS 10401: 8-bit code for information interchange, 1982.
2. IS 10315: 7-bit coded character set for information interchange, 1985.
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987.
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001.
5. ISO 2375: Procedure for registration of escape sequences, 2003.
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001.
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1          Confusable 2
क U+0915              क़ U+0915 U+093C
ख U+0916              ख़ U+0916 U+093C
ग U+0917              ग़ U+0917 U+093C
च U+091A              च़ U+091A U+093C
छ U+091B              छ़ U+091B U+093C
ज U+091C              ज़ U+091C U+093C
ड U+0921              ड़ U+0921 U+093C
ढ U+0922              ढ़ U+0922 U+093C
फ U+092B              फ़ U+092B U+093C
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are grouped only on the basis of possible visual confusability.
At first they may not look exactly the same; however, in a given context (e.g. in a browser bar, as a part of a domain name, or as a single word where there is no surrounding text from the same script to distinguish them) they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it provides a visual distinction and hence yields a non-confusable string.
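The "all constituents must have a counterpart" condition above can be illustrated with a short sketch. It is a simplification under stated assumptions: the function name is hypothetical, only the Devanagari-Gurmukhi pairs of Table 22 are used, and matching is done per code point rather than per akshara.

```python
from itertools import product

# Devanagari -> Gurmukhi confusables, transcribed from Table 22 (Gurmukhi rows only).
GURMUKHI_CONFUSABLES = {
    "\u0920": ["\u0A28", "\u0A30"],  # TTHA -> NA, RA
    "\u0921": ["\u0A21", "\u0A24"],  # DDA  -> DDA, TA
    "\u0922": ["\u0A22"],            # DDHA -> DDHA
    "\u0924": ["\u0A1C"],            # TA   -> JA
    "\u092F": ["\u0A27"],            # YA   -> DHA
}


def cross_script_confusable_labels(label: str) -> list:
    """Return every Gurmukhi string confusable with the whole label.

    An empty list means that at least one code point has no counterpart,
    so the label as a whole is visually distinct.
    """
    if not label:
        return []
    options = []
    for ch in label:
        if ch not in GURMUKHI_CONFUSABLES:
            return []  # one distinguishing character is enough
        options.append(GURMUKHI_CONFUSABLES[ch])
    return ["".join(combo) for combo in product(*options)]
```

A label made only of U+0924 and U+092F therefore has a Gurmukhi counterpart, while adding a single character outside the table, such as U+0915, removes it.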
Devanagari confusable     Other script confusable     From script
ः U+0903                  ઃ U+0A83                    Gujarati
ः U+0903                  ః U+0C03                    Telugu
ः U+0903                  ಃ U+0C83                    Kannada
ः U+0903                  ഃ U+0D03                    Malayalam
ः U+0903                  ඃ U+0D83                    Sinhala
ः U+0903                  ঃ U+0983 (footnote 17)      Bengali
उ U+0909                  ও U+0993                    Bengali
घ U+0918                  ঘ U+0998                    Bengali
ठ U+0920                  ਨ U+0A28                    Gurmukhi
ठ U+0920                  ਰ U+0A30                    Gurmukhi
ड U+0921                  ਡ U+0A21                    Gurmukhi
ड U+0921                  ਤ U+0A24                    Gurmukhi
ढ U+0922                  ਢ U+0A22                    Gurmukhi
त U+0924                  ਜ U+0A1C                    Gurmukhi
य U+092F                  ਧ U+0A27                    Gurmukhi
U+0945                    U+0981                      Bengali
Table 22: Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
8 0909 उ DEVANAGARILETTERU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
9 090A ऊ DEVANAGARILETTERUU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
10 090B ऋ DEVANAGARI
LETTERVOCALICR
Vowel HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
11 090D ऍ DEVANAGARILETTERCANDRAE Vowel Hindi 1Hindi [0][101]
12 090E ऎ DEVANAGARILETTERSHORTE Vowel Kashmiri 4Kashmiri [0][105][108]
13 090F ए DEVANAGARILETTERE Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
14 0910 ऐ DEVANAGARILETTERAI Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
15 0911 ऑ DEVANAGARI
LETTERCANDRAO
Vowel HindiKonkaniMarathiKashmiri 1Hindi
[0][100][101][102][108]
[112]
16 0912 ऒ DEVANAGARILETTERSHORTO Vowel Kashmiri 4Kashmiri [0][105][108]
17 0913 ओ DEVANAGARILETTERO Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
18 0914 औ DEVANAGARILETTERAU Vowel
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
19 0915 क DEVANAGARILETTERKA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
20 0916 ख DEVANAGARILETTERKHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
21 0917 ग DEVANAGARILETTERGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
22 0918 घ DEVANAGARILETTERGHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
23 0919 ङ DEVANAGARILETTERNGA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
24 091A च DEVANAGARILETTERCA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
25 091B छ DEVANAGARILETTERCHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
26 091C ज DEVANAGARILETTERJA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
27 091D झ DEVANAGARILETTERJHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
28 091E ञ DEVANAGARILETTERNYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
29 091F ट DEVANAGARILETTERTTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
30 0920 ठ DEVANAGARILETTERTTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
31 0921 ड DEVANAGARILETTERDDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
32 0922 ढ DEVANAGARILETTERDDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
33 0923 ण DEVANAGARILETTERNNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
34 0924 त DEVANAGARILETTERTA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
35 0925 थ DEVANAGARILETTERTHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
36 0926 द DEVANAGARILETTERDA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
37 0927 ध DEVANAGARILETTERDHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
38 0928 न DEVANAGARILETTERNA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
39 092A प DEVANAGARILETTERPA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
40 092B फ DEVANAGARILETTERPHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
41 092C ब DEVANAGARILETTERBA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
42 092D भ DEVANAGARILETTERBHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
43 092E म DEVANAGARILETTERMA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
44 092F य DEVANAGARILETTERYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
45 0930 र DEVANAGARILETTERRA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
46 0932 ल DEVANAGARILETTERLA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
47 0933 ळ DEVANAGARILETTERLLA Consonant
BodoKonkaniMarathiNepali
Sanskrit1Nepali
[0][102][103][110][112]
[113]
48 0935 व DEVANAGARILETTERVA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
49 0936 श DEVANAGARILETTERSHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
50 0937 ष DEVANAGARILETTERSSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
51 0938 स DEVANAGARILETTERSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
52 0939 ह DEVANAGARILETTERHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
53 093A DEVANAGARI VOWEL SIGN OE Matra Kashmiri 4Kashmiri [11][105][108]
54 093B ऻ DEVANAGARI VOWEL SIGN OOE Matra Kashmiri 4Kashmiri [11][105][108]
55 093C DEVANAGARISIGNNUKTA Nukta
BodoHindiKashmiriMaithiliSantaliSindhi
1Hindi[0][101][105][108][110][109][111]
56 093E ा DEVANAGARIVOWELSIGNAA Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
57 093F ि DEVANAGARIVOWELSIGNI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
58 0940 ी DEVANAGARIVOWELSIGNII Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
59 0941 DEVANAGARIVOWELSIGNU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
60 0942 DEVANAGARIVOWELSIGNUU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
61 0943
DEVANAGARIVOWELSIGNVOCALICR
Matra HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
62 0945
DEVANAGARIVOWELSIGNCANDRAE=candra
MatraHindiKonkaniMarathiSanskrit
Kashmiri1Hindi [0][100][101]
[108]
63 0946
DEVANAGARIVOWELSIGNSHORTE
Matra Kashmiri 4Kashmiri [0][105][108]
64 0947 DEVANAGARIVOWELSIGNE Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
65 0948 DEVANAGARIVOWELSIGNAI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
66 0949 ॉ DEVANAGARIVOWELSIGNCANDRAO
Matra HindiKonkaniMarathiKashmiri 1Hindi [0][100][108]
67 094A ॊ DEVANAGARIVOWELSIGNSHORTO
Matra Kashmiri 4Kashmiri [0][105][108]
68 094B ो DEVANAGARIVOWELSIGNO Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
69 094C ौ DEVANAGARIVOWELSIGNAU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
70 094D DEVANAGARISIGN
VIRAMAHalantVirama
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
71 094F ॏ DEVANAGARIVOWELSIGNAW Matra Kashmiri 4Kashmiri [0][105][108]
72 0956 DEVANAGARIVOWELSIGNUE Matra Kashmiri 4Kashmiri [11][105][108]
73 0957 DEVANAGARIVOWELSIGNUUE Matra Kashmiri 4Kashmiri [11][105][108]
74 0972 ॲ
DEVANAGARILETTERCANDRA
AVowel KonkaniMarathi
Kashmiri2KonkaniMarathi
[9][100][102][108][112]
75 0973 ॳ DEVANAGARILETTEROE Vowel Kashmiri 4Kashmiri [11][105][108]
76 0974 ॴ DEVANAGARILETTEROOE Vowel Kashmiri 4Kashmiri [11][105][108]
77 0975 ॵ DEVANAGARILETTERAW Vowel Kashmiri 4Kashmiri [11][105][108]
78 0976 ॶ DEVANAGARILETTERUE Vowel Kashmiri 4Kashmiri [11][105][108]
79 0977 ॷ DEVANAGARILETTERUUE Vowel Kashmiri 4Kashmiri [11][105][108]
80 097B ॻ DEVANAGARILETTERGGA Consonant Sindhi 2Sindhi [8][104]
81 097C ॼ DEVANAGARILETTERJJA Consonant Sindhi 2Sindhi [8][104]
82 097E ॾ DEVANAGARILETTERDDDA Consonant Sindhi 2Sindhi [8][104]
83 097F ॿ DEVANAGARILETTERBBA Consonant Sindhi 2Sindhi [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of DEVANAGARI LETTER RRA in the repertoire, in order to allow the "Eyelash Reph" construct (see footnote 13).
Sr. No.    Unicode Code Points     Sequence    Character Names                                                                  Example languages using the code point (not an exhaustive list)    Reference
1          U+0931 U+094D U+092F    ऱ्य         DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA              Konkani, Marathi, Nepali                                            [106][107]
2          U+0931 U+094D U+0939    ऱ्ह         DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA              Konkani, Marathi, Nepali                                            [106][107]
Table 7: Sequences
13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by the normal Ra, U+0930), the term "Reph" is used here.
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No.    Unicode Code Point    Glyph    Character Name                        Reason for exclusion
1          U+0904                ऄ        DEVANAGARI LETTER SHORT A             Usage unknown. Not required explicitly by any language.
2          U+090C                ऌ        DEVANAGARI LETTER VOCALIC L           Not in modern usage. Excluded as per the conservatism principle.
3          U+0929                ऩ        DEVANAGARI LETTER NNNA                Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4          U+0934                ऴ        DEVANAGARI LETTER LLLA                Not required in any spoken language. Required only for transcribing Dravidian l.
5          U+0944                         DEVANAGARI VOWEL SIGN VOCALIC RR      Not in modern usage. Excluded as per the conservatism principle.
6          U+0979                ॹ        DEVANAGARI LETTER ZHA                 Not required in any spoken language. Required only in transliteration of Avestan.
7          U+097A                ॺ        DEVANAGARI LETTER HEAVY YA            Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of forming their words, known as akshar. The next section gives detailed akshar formation rules as applicable to the representation of the Hindi language when written in the Devanagari script. These rules need slight additions for the different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required to accommodate additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7 the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first subsection lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second subsection lists four operators, along with their functions, which are assumed while specifying the rules. The following two subsections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
5.5.2 Operators used
Symbol    Function
|         Alternative
[]        Optional
*         Variable repetition
()        Sequence group
Table 8: Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore V[B|D|X]
Examples:
Sequence Description      Sequence    Example     Constituting characters
Vowel                     V           अ (a)       U+0905
Vowel + Anusvara          V[B]        अं (aṁ)     U+0905 U+0902
Vowel + Candrabindu       V[D]        अँ (aṃ)     U+0905 U+0901
Vowel + Visarga           V[X]        अः (aḥ)     U+0905 U+0903
Table 9
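The vowel sequence V[B|D|X] maps directly onto a regular expression. The sketch below is illustrative only: the function name is hypothetical, and the vowel set is an assumption (the independent vowels U+0905 to U+0914 without U+090C, which this proposal excludes) rather than a Hindi-specific list taken from the document.

```python
import re

# Assumed vowel set: U+0905..U+0914 without the excluded U+090C.
VOWELS = "\u0905-\u090B\u090D-\u0914"
ANUSVARA, CANDRABINDU, VISARGA = "\u0902", "\u0901", "\u0903"

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(f"[{VOWELS}][{ANUSVARA}{CANDRABINDU}{VISARGA}]?")


def is_vowel_sequence(text: str) -> bool:
    """True when the whole string is a single vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Using `fullmatch` rather than `match` is what enforces the "restricted to one" condition: a second sign after the vowel leaves unmatched input and the check fails.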
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the consonant sequence after the N, M and H. Each of these is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description      Sequence    Example     Constituting characters
Consonant                 C           क (ka)      U+0915 <single character>
Consonant + Nukta         C[N]        क़ (ḳa)      U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि (ki) | U+0915 U+093F
Consonant + Anusvara | C[B] | कं (kaṁ) | U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ (kaṃ) | U+0915 U+0901
Consonant + Visarga | C[X] | कः (kaḥ) | U+0915 U+0903
Consonant + Halant | C[H] | क् (k, Pure Consonant) | U+0915 U+094D
Table 11
2A. A CM sequence can optionally be followed by D, B or X:
(CM)[D|B|X]
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं (kīṁ) | U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ (kāṃ) | U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः (kīḥ) | U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14):
*3(CH)C
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य (nkrya) | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
14 In the case of Sanskrit, it can join up to 5 consonants.

Subsets
3A. The combination may be followed by M, B, D or X.
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की (kkī) | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं (kkaṁ) | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ (kkaṃ) | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः (kkaḥ) | U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X.
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं (kkīṁ) | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ (kkīṃ) | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः (kkīḥ) | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
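The consonant-sequence patterns above can be folded into a single pattern over category letters. The following is a rough Python sketch under stated assumptions: the code point ranges used for C and M are simplifications for illustration, the helper names are hypothetical, and the limit of three CH repetitions reflects the Hindi description above (the WLE rules in Section 7 do not impose it).

```python
import re

# Assumption: simplified category ranges, for illustration only.
SIGNS = {0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H", 0x093C: "N"}


def category(ch: str) -> str:
    """Map one code point to a category letter; '?' means unclassified here."""
    cp = ord(ch)
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x093E <= cp <= 0x094C:
        return "M"
    return SIGNS.get(cp, "?")


def to_categories(text: str) -> str:
    return "".join(category(ch) for ch in text)


# Up to three (C[N]H) groups, a final C[N], then either H or [M][B|D|X].
CONSONANT_SEQUENCE = re.compile(r"(?:CN?H){0,3}CN?(?:H|M?[BDX]?)")


def is_consonant_sequence(text: str) -> bool:
    return CONSONANT_SEQUENCE.fullmatch(to_categories(text)) is not None
```

Working on category letters rather than raw code points keeps the pattern close to the notation used in the tables: the Table 13 example maps to `CHCHCHC` and is accepted, while a fifth consonant is rejected.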
6 Variants
There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String Similarity Assessment Panel) entrusted to deal with such cases. Table 21: Visually confusables in Appendix A: Visually confusable characters/sequences lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning, which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
◌ा U+093E | ◌ा़ U+093E U+093C
◌ो U+094B | ◌ो़ U+094B U+093C
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16: Proposed Variants - Set 1 share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906) as the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C), where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16: Proposed Variants - Set 1, the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
◌ा (U+093E) → ◌ा़ (U+093E U+093C)
◌ो (U+094B) → ◌ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both the forward and the symmetric (reverse) variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
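The effect of the "not followed by a Nukta" condition can be illustrated with a short Python sketch. It is not the LGR's XML rule, only an approximation of its behaviour; the function names are hypothetical.

```python
NUKTA = "\u093C"
# Variant 1 code points from Table 16 (Set 1).
SET1 = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variant_positions(label: str) -> list[int]:
    """Indices where a Set 1 code point may map to its Nukta variant.

    The mapping is only defined when the code point is NOT already
    followed by a Nukta, which is what stops the regeneration.
    """
    return [
        i
        for i, ch in enumerate(label)
        if ch in SET1 and label[i + 1 : i + 2] != NUKTA
    ]


def add_nukta(label: str, i: int) -> str:
    """Apply the Variant 1 -> Variant 2 mapping at index i."""
    return label[: i + 1] + NUKTA + label[i + 1 :]
```

Applying `add_nukta` at an allowed position yields a label in which that position is no longer eligible, so a second Nukta can never be generated by repeated substitution.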
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ◌ा | 093E 093C | ◌ा़ | ↔ blocked | not followed-by-N

When the variant from the overlapping sets is substituted for 0906 or 093E, respectively, in the sequences of Variant Sets A, B, C and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise, for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)

Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
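The symmetry and transitivity requirement on the extended sets can be checked mechanically. The following Python sketch is an illustration only: it copies the members of the extended sets A to D from the table above, ignores the context rules, and uses hypothetical helper names.

```python
from itertools import permutations

# Members of the extended variant sets, copied from the table above.
VARIANT_SETS = {
    "A": ["093E 0901", "093E 093C 0901", "0949 0902"],
    "B": ["0906 0901", "0906 093C 0901", "0911 0902"],
    "C": ["0906 0902", "0906 093C 0902", "0974"],
    "D": ["093B", "093E 0902", "093E 093C 0902"],
}


def all_mappings(members: list[str]) -> set[tuple[str, str]]:
    """Every ordered pair of distinct members, i.e. the full mapping set."""
    return set(permutations(members, 2))


def is_symmetric(mappings: set[tuple[str, str]]) -> bool:
    return all((target, source) in mappings for source, target in mappings)


def is_transitive(mappings: set[tuple[str, str]]) -> bool:
    return all(
        a == d or (a, d) in mappings
        for a, b in mappings
        for c, d in mappings
        if b == c
    )
```

A set of three members needs six mappings to be both symmetric and transitive; a chain that only links neighbouring members passes the symmetry check but fails the transitivity check.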
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
◌ऺ U+093A | ◌ं U+0902
ॴ U+0974 | आं U+0906 U+0902
◌ऻ U+093B | ◌ां U+093E U+0902
ऎ U+090E | ऐ U+0910
◌ॆ U+0946 | ◌े U+0947
ॵ U+0975 | औ U+0914
◌ॏ U+094F | ◌ौ U+094C
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ◌ॅ (U+0945) and ◌ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of the variants are as follows:
Variant 1 | Variant 2
◌ँ U+0901 | ◌ॅं U+0945 U+0902
◌ाँ U+093E U+0901 | ◌ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character like a Consonant or a Vowel before them. For the sake of clarity, both pairs preceded by the Devanagari Letter Ka (क - U+0915) are shown here:
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.
Rule: As per Table 18: Proposed Variants - Set 3, the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
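The context condition on the Candrabindu mapping can be sketched as a small predicate. This is an illustrative Python sketch only: the consonant range U+0915 to U+0939 is an assumption made for the example, and the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"


def is_consonant(ch: str) -> bool:
    # Assumption: consonants approximated by the range U+0915-U+0939.
    return 0x0915 <= ord(ch) <= 0x0939


def candrabindu_variant_allowed(label: str, i: int) -> bool:
    """True if the Candrabindu at index i may map to U+0945 U+0902.

    The mapping is defined only when the Candrabindu is preceded by a
    Consonant, or by a Consonant followed by a Nukta.
    """
    if label[i] != CANDRABINDU or i == 0:
        return False
    prev = label[i - 1]
    if is_consonant(prev):
        return True
    return prev == NUKTA and i >= 2 and is_consonant(label[i - 2])
```

A Candrabindu after a vowel or a Matra is left unmapped, because the replacement sequence starts with a Matra that could not validly follow those characters.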
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are the variants proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed as cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
◌ं U+0902 | ◌ਂ U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
◌ऺ U+093A | ◌ਂ U+0A02
◌़ U+093C | ◌਼ U+0A3C
◌ि U+093F | ◌ਿ U+0A3F
◌ी U+0940 | ◌ੀ U+0A40
◌ॅ U+0945 | ◌ੱ U+0A71
◌ॆ U+0946 | ◌ੇ U+0A47
◌ॆ U+0946 | ◌ੋ U+0A4B
◌े U+0947 | ◌ੇ U+0A47
◌े U+0947 | ◌ੋ U+0A4B
◌ै U+0948 | ◌ੈ U+0A48
◌ॖ U+0956 | ◌ੁ U+0A41
◌ॗ U+0957 | ◌ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants

Devanagari | Bengali
म U+092E | ম U+09AE
◌ि U+093F | ◌ি U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22: Devanagari Cross-script confusables in Appendix B: Cross-script Confusables lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6: Code point repertoire:
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
   The set C1 consists of these consonants:
   a. क (U+0915)
   b. ख (U+0916)
   c. ग (U+0917)
   d. च (U+091A)
   e. छ (U+091B)
   f. ज (U+091C)
   g. ड (U+0921)
   h. ढ (U+0922)
   i. फ (U+092B)
   The set V1 consists of these vowels:
   a. आ (U+0906) (Required in Santali language)
   b. ओ (U+0913) (Required in Santali language)
   The set M1 consists of these matras:
   a. ◌ा (U+093E) (Required in Santali language)
   b. ◌ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (see footnote 15)
3. M must be preceded by C or CN (see footnote 16)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
15, 16 where CN is a C followed by an N
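To make the seven rules concrete, the following Python sketch checks a label against them and reports which rule fails at which position. It is a rough approximation, not the normative LGR: the category ranges for V, C and M are assumptions made for the example (they omit, for instance, the Kashmiri-specific code points), and `wle_violations` is a hypothetical name.

```python
# Sets C1, V1 and M1 as listed under rule 1.
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
SIGNS = {"\u0902": "B", "\u0901": "D", "\u0903": "X", "\u094D": "H", "\u093C": "N"}


def category(ch: str) -> str:
    # Assumption: simplified ranges, for illustration only.
    cp = ord(ch)
    if 0x0905 <= cp <= 0x0914:
        return "V"
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x093E <= cp <= 0x094C:
        return "M"
    return SIGNS.get(ch, "?")


def wle_violations(label: str) -> list[tuple[int, int]]:
    """Return (index, rule number) for every WLE rule broken in label."""
    found = []
    for i, ch in enumerate(label):
        cat = category(ch)
        prev = label[i - 1] if i > 0 else ""
        pcat = category(prev) if prev else ""
        # "C or CN": a consonant, or a Nukta that itself follows a consonant.
        after_c_or_cn = pcat == "C" or (
            pcat == "N" and i >= 2 and category(label[i - 2]) == "C"
        )
        if cat == "N" and prev not in C1 | V1 | M1:
            found.append((i, 1))
        elif cat == "H" and not after_c_or_cn:
            found.append((i, 2))
        elif cat == "M" and not after_c_or_cn:
            found.append((i, 3))
        elif cat in ("X", "B", "D") and pcat not in ("V", "C", "N", "M"):
            found.append((i, {"X": 4, "B": 5, "D": 6}[cat]))
        elif cat == "V" and pcat == "H":
            found.append((i, 7))
    return found
```

An empty result means no rule was violated; a label whose Halant is immediately followed by a vowel is reported under rule 7.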
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)

Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7: Sequences, it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of a context rule is required.

Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any of the valid akshar-ending characters. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | DataXgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | DataXgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant httpsमराभारत | India | Hindi

In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC 7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC 8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", RFC 8228, August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11, http://www.unicode.org/versions/Unicode11.0.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1  Confusable 2
क U+0915  क़ U+0915 U+093C
ख U+0916  ख़ U+0916 U+093C
ग U+0917  ग़ U+0917 U+093C
च U+091A  च़ U+091A U+093C
छ U+091B  छ़ U+091B U+093C
ज U+091C  ज़ U+091C U+093C
ड U+0921  ड़ U+0921 U+093C
ढ U+0922  ढ़ U+0922 U+093C
फ U+092B  फ़ U+092B U+093C
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in a given context (e.g. in a browser bar as part of a domain name, or as a single word with no surrounding text from the same script to distinguish them), they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
Devanagari confusable  Other script confusable  From script
ः U+0903  ઃ U+0A83  Gujarati
ः U+0903  ః U+0C03  Telugu
ः U+0903  ಃ U+0C83  Kannada
ः U+0903  ഃ U+0D03  Malayalam
ः U+0903  ඃ U+0D83  Sinhala
ः U+0903  ঃ U+0983 (footnote 17)  Bengali
उ U+0909  ও U+0993  Bengali
घ U+0918  ঘ U+0998  Bengali
ठ U+0920  ਨ U+0A28  Gurmukhi
ठ U+0920  ਰ U+0A30  Gurmukhi
ड U+0921  ਡ U+0A21  Gurmukhi
ड U+0921  ਤ U+0A24  Gurmukhi
ढ U+0922  ਢ U+0A22  Gurmukhi
त U+0924  ਜ U+0A1C  Gurmukhi
य U+092F  ਧ U+0A27  Gurmukhi
ॅ U+0945  ঁ U+0981  Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
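The whole-label condition stated in this appendix (a label is confusable across scripts only if every constituent has a confusable counterpart) can be illustrated with a minimal Python sketch. The dictionary and function names below are hypothetical and not part of the LGR; the data is limited to the Gurmukhi rows of Table 22, and the check works per code point rather than per akshara, which is a simplifying assumption.

```python
# Hypothetical lookup built from the Gurmukhi rows of Table 22 only.
DEVA_TO_GURMUKHI_CONFUSABLES = {
    "\u0920": ["\u0A28", "\u0A30"],  # TTHA: NA, RA
    "\u0921": ["\u0A21", "\u0A24"],  # DDA: DDA, TA
    "\u0922": ["\u0A22"],            # DDHA: DDHA
    "\u0924": ["\u0A1C"],            # TA: JA
    "\u092F": ["\u0A27"],            # YA: DHA
}


def is_fully_confusable(label, table=DEVA_TO_GURMUKHI_CONFUSABLES):
    """True only if every code point of the label has a confusable in the table."""
    return bool(label) and all(ch in table for ch in label)


def first_distinguishing_char(label, table=DEVA_TO_GURMUKHI_CONFUSABLES):
    """Return the first code point that provides a visual distinction, or None."""
    for ch in label:
        if ch not in table:
            return ch
    return None
```

A single code point without a counterpart (for example U+0915 KA in this reduced table) is enough for `is_fully_confusable` to return False, which mirrors the "even one single character" condition in the text.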
20  0916  ख  DEVANAGARI LETTER KHA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
21  0917  ग  DEVANAGARI LETTER GA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
22  0918  घ  DEVANAGARI LETTER GHA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][113]
23  0919  ङ  DEVANAGARI LETTER NGA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][113]
24  091A  च  DEVANAGARI LETTER CA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
25  091B  छ  DEVANAGARI LETTER CHA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
26  091C  ज  DEVANAGARI LETTER JA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
27  091D  झ  DEVANAGARI LETTER JHA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][113]
28  091E  ञ  DEVANAGARI LETTER NYA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][113]
29  091F  ट  DEVANAGARI LETTER TTA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
30  0920  ठ  DEVANAGARI LETTER TTHA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
31  0921  ड  DEVANAGARI LETTER DDA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
32  0922  ढ  DEVANAGARI LETTER DDHA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][113]
33  0923  ण  DEVANAGARI LETTER NNA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][113]
34  0924  त  DEVANAGARI LETTER TA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
35  0925  थ  DEVANAGARI LETTER THA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
36  0926  द  DEVANAGARI LETTER DA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
37  0927  ध  DEVANAGARI LETTER DHA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
38  0928  न  DEVANAGARI LETTER NA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
39  092A  प  DEVANAGARI LETTER PA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
40  092B  फ  DEVANAGARI LETTER PHA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
41  092C  ब  DEVANAGARI LETTER BA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
42  092D  भ  DEVANAGARI LETTER BHA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
43  092E  म  DEVANAGARI LETTER MA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
44  092F  य  DEVANAGARI LETTER YA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
45  0930  र  DEVANAGARI LETTER RA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
46  0932  ल  DEVANAGARI LETTER LA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
47  0933  ळ  DEVANAGARI LETTER LLA  Consonant  Bodo, Konkani, Marathi, Nepali, Sanskrit  1 Nepali  [0][102][103][110][112][113]
48  0935  व  DEVANAGARI LETTER VA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
49  0936  श  DEVANAGARI LETTER SHA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
50  0937  ष  DEVANAGARI LETTER SSA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][113]
51  0938  स  DEVANAGARI LETTER SA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
52  0939  ह  DEVANAGARI LETTER HA  Consonant  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][104][105][108][113]
53  093A  ऺ  DEVANAGARI VOWEL SIGN OE  Matra  Kashmiri  4 Kashmiri  [11][105][108]
54  093B  ऻ  DEVANAGARI VOWEL SIGN OOE  Matra  Kashmiri  4 Kashmiri  [11][105][108]
55  093C  ़  DEVANAGARI SIGN NUKTA  Nukta  Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi  1 Hindi  [0][101][105][108][110][109][111]
56  093E  ा  DEVANAGARI VOWEL SIGN AA  Matra  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][113]
57  093F  ि  DEVANAGARI VOWEL SIGN I  Matra  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][113]
58  0940  ी  DEVANAGARI VOWEL SIGN II  Matra  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][113]
59  0941  ु  DEVANAGARI VOWEL SIGN U  Matra  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][113]
60  0942  ू  DEVANAGARI VOWEL SIGN UU  Matra  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][113]
61  0943  ृ  DEVANAGARI VOWEL SIGN VOCALIC R  Matra  Hindi, Marathi, Sanskrit  1 Hindi  [0][101][102][103]
62  0945  ॅ  DEVANAGARI VOWEL SIGN CANDRA E = candra  Matra  Hindi, Konkani, Marathi, Sanskrit, Kashmiri  1 Hindi  [0][100][101][108]
63  0946  ॆ  DEVANAGARI VOWEL SIGN SHORT E  Matra  Kashmiri  4 Kashmiri  [0][105][108]
64  0947  े  DEVANAGARI VOWEL SIGN E  Matra  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][105][108][113]
65  0948  ै  DEVANAGARI VOWEL SIGN AI  Matra  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][113]
66  0949  ॉ  DEVANAGARI VOWEL SIGN CANDRA O  Matra  Hindi, Konkani, Marathi, Kashmiri  1 Hindi  [0][100][108]
67  094A  ॊ  DEVANAGARI VOWEL SIGN SHORT O  Matra  Kashmiri  4 Kashmiri  [0][105][108]
68  094B  ो  DEVANAGARI VOWEL SIGN O  Matra  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][105][108][113]
69  094C  ौ  DEVANAGARI VOWEL SIGN AU  Matra  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][105][108][113]
70  094D  ्  DEVANAGARI SIGN VIRAMA  Halant/Virama  Most of the languages given in section 3.2  1 Hindi, Nepali  [0][101][102][103][105][108][113]
71  094F  ॏ  DEVANAGARI VOWEL SIGN AW  Matra  Kashmiri  4 Kashmiri  [0][105][108]
72  0956  ॖ  DEVANAGARI VOWEL SIGN UE  Matra  Kashmiri  4 Kashmiri  [11][105][108]
73  0957  ॗ  DEVANAGARI VOWEL SIGN UUE  Matra  Kashmiri  4 Kashmiri  [11][105][108]
74  0972  ॲ  DEVANAGARI LETTER CANDRA A  Vowel  Konkani, Marathi, Kashmiri  2 Konkani, Marathi  [9][100][102][108][112]
75  0973  ॳ  DEVANAGARI LETTER OE  Vowel  Kashmiri  4 Kashmiri  [11][105][108]
76  0974  ॴ  DEVANAGARI LETTER OOE  Vowel  Kashmiri  4 Kashmiri  [11][105][108]
77  0975  ॵ  DEVANAGARI LETTER AW  Vowel  Kashmiri  4 Kashmiri  [11][105][108]
78  0976  ॶ  DEVANAGARI LETTER UE  Vowel  Kashmiri  4 Kashmiri  [11][105][108]
79  0977  ॷ  DEVANAGARI LETTER UUE  Vowel  Kashmiri  4 Kashmiri  [11][105][108]
80  097B  ॻ  DEVANAGARI LETTER GGA  Consonant  Sindhi  2 Sindhi  [8][104]
81  097C  ॼ  DEVANAGARI LETTER JJA  Consonant  Sindhi  2 Sindhi  [8][104]
82  097E  ॾ  DEVANAGARI LETTER DDDA  Consonant  Sindhi  2 Sindhi  [8][104]
83  097F  ॿ  DEVANAGARI LETTER BBA  Consonant  Sindhi  2 Sindhi  [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of DEVANAGARI LETTER RRA in the repertoire, in order to allow the "Eyelash Reph" (footnote 13) construct.
Sr. No.  Unicode Code Points  Sequence  Character Names  Example languages using the code point (not exhaustive list)  Reference
1  0931 094D 092F  ऱ्य  DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA  Konkani, Marathi, Nepali  [106][107]
13 Unicode uses the term "Eyelash Ra" instead. Since the construct formed by this sequence is a special form of Reph (which is otherwise formed by the normal Ra, U+0930), the term "Reph" is used here.
2  0931 094D 0939  ऱ्ह  DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA  Konkani, Marathi, Nepali  [106][107]
Table 7 Sequences
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No.  Unicode Code Point  Glyph  Character Name  Reason for exclusion
1  U+0904  ऄ  DEVANAGARI LETTER SHORT A  Usage unknown. Not required explicitly by any language.
2  U+090C  ऌ  DEVANAGARI LETTER VOCALIC L  Not in modern usage. Excluded as per the conservatism principle.
3  U+0929  ऩ  DEVANAGARI LETTER NNNA  Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4  U+0934  ऴ  DEVANAGARI LETTER LLLA  Not required in any spoken language. Required only for transcribing Dravidian l.
5  U+0944  ॄ  DEVANAGARI VOWEL SIGN VOCALIC RR  Not in modern usage. Excluded as per the conservatism principle.
6  U+0979  ॹ  DEVANAGARI LETTER ZHA  Not required in any spoken language. Required only in transliteration of Avestan.
7  U+097A  ॺ  DEVANAGARI LETTER HEAVY YA  Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of forming their words, built on a unit known as the akshar. The next section gives detailed akshar formation rules as applicable to the representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodating additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first subsection lists the categories of the characters in the form of variables; in the rules, the variable names are used instead of the descriptive names. The second subsection lists four operators, along with their functions, which are assumed while specifying the rules. The following two subsections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen -
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol  Function
|  Alternative
[ ]  Optional
*  Variable Repetition
( )  Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari, when used to write Hindi, are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description  Sequence  Example  Constituting characters
Vowel  V  अ a  U+0905
Vowel + Anusvara  V[B]  अं aṁ  अ ं  U+0905 U+0902
Vowel + Candrabindu  V[D]  अँ aṃ  अ ँ  U+0905 U+0901
Vowel + Visarga  V[X]  अः aḥ  अ ः  U+0905 U+0903
Table 9
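The vowel sequence V[B|D|X] can be expressed as a regular expression. The following Python sketch is only an illustration: the set of vowels is an assumed Hindi subset chosen for the example, and the names `VOWELS`, `VOWEL_SEQ` and `is_vowel_sequence` are hypothetical.

```python
import re

# Assumption: an illustrative subset of independent vowels used for Hindi
# (U+0905..U+090B, U+090D, U+090F..U+0911, U+0913, U+0914).
VOWELS = "\u0905-\u090B\u090D\u090F-\u0911\u0913\u0914"
# B = Anusvara U+0902, D = Candrabindu U+0901, X = Visarga U+0903.
MODIFIERS = "\u0901\u0902\u0903"

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQ = re.compile(f"[{VOWELS}][{MODIFIERS}]?")


def is_vowel_sequence(text):
    """Return True if the whole string is a single vowel sequence."""
    return VOWEL_SEQ.fullmatch(text) is not None
```

Because the modifier class is followed by `?`, at most one of B, D or X is accepted, so a Visarga after an Anusvara or Candrabindu is rejected, as the text requires.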
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the consonant sequence after the N, M and H. Each of these is discussed in the following sections.
1 A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign wherever such a case is pertinent.)
Examples:
Sequence Description  Sequence  Example  Constituting characters
Consonant  C  क ka  U+0915 <single character>
Consonant + Nukta  C[N]  क़ ḳa  क ़  U+0915 U+093C
Table 10
2 A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description  Sequence  Example  Constituting characters
Consonant + Matra  C[M]  कि ki  क ि  U+0915 U+093F
Consonant + Anusvara  C[B]  कं kaṁ  क ं  U+0915 U+0902
Consonant + Candrabindu  C[D]  कँ kaṃ  क ँ  U+0915 U+0901
Consonant + Visarga  C[X]  कः kaḥ  क ः  U+0915 U+0903
Consonant + Halant  C[H]  क् k (Pure Consonant)  क ्  U+0915 U+094D
Table 11
2A A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description  Sequence  Example  Constituting characters
Consonant + Matra + Anusvara  CM[B]  कीं kīṁ  क ी ं  U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu  CM[D]  काँ kāṃ  क ा ँ  U+0915 U+093E U+0901
Consonant + Matra + Visarga  CM[X]  कीः kīḥ  क ी ः  U+0915 U+0940 U+0903
Table 12
3 A sequence of consonants (up to 4) joined by Halant (footnote 14):
*3(CH)C
Example:
Sequence Description  Sequence  Example  Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant  CHCHCHC  न्क्र्य nkrya  न ् क ् र ् य  U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets
3A The combination may be followed by M, B, D or X.
Example:
14 In the case of Sanskrit, it can join up to 5 consonants.
Sequence Description  Sequence  Example  Constituting characters
Consonant + Halant + Consonant + Matra  CHC[M]  क्की kkī  क ् क ी  U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara  CHC[B]  क्कं kkaṁ  क ् क ं  U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu  CHC[D]  क्कँ kkaṃ  क ् क ँ  U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga  CHC[X]  क्कः kkaḥ  क ् क ः  U+0915 U+094D U+0915 U+0903
Table 14
3B *3(CH)CM may be followed by a B, D or X.
Example:
Sequence Description  Sequence  Example  Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara  CHCM[B]  क्कीं kkīṁ  क ् क ी ं  U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu  CHCM[D]  क्कीँ kkīṃ  क ् क ी ँ  U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga  CHCM[X]  क्कीः kkīḥ  क ् क ी ः  U+0915 U+094D U+0915 U+0940 U+0903
Table 15
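The consonant sequence rules 1 to 3B above can be folded into one pattern. The Python sketch below is an illustration under stated assumptions: the character classes are a simplified Hindi subset picked for the example, the consonant is treated as coterminous with consonant plus Nukta, and up to three Consonant + Halant pairs may precede the final consonant (four consonants in total). All names are hypothetical.

```python
import re

# Assumption: simplified Hindi character classes, for illustration only.
C = "[\u0915-\u0928\u092A-\u0930\u0932\u0935-\u0939]"    # consonants
N = "\u093C"                                             # Nukta
H = "\u094D"                                             # Halant
M = "[\u093E-\u0943\u0945\u0947-\u0949\u094B\u094C]"     # matras
BDX = "[\u0901-\u0903]"                                  # Candrabindu, Anusvara, Visarga

CN = f"{C}{N}?"  # a consonant, optionally carrying a Nukta

# Up to three (CH) pairs, then C, then either a final Halant
# or an optional Matra followed by an optional B, D or X.
CONSONANT_SEQ = re.compile(f"(?:{CN}{H}){{0,3}}{CN}(?:{H}|{M}?{BDX}?)")


def is_consonant_sequence(text):
    """Return True if the whole string is one consonant-initial akshar."""
    return CONSONANT_SEQ.fullmatch(text) is not None
```

The `{0,3}` bound reflects the Hindi-specific limit of four joined consonants; as noted in the text, the WLE rules in Section 7 do not impose that limit, so a validator for the full LGR would not carry this bound.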
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters
and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
6 Variants
There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants. A brief description of these variants follows, with the variants themselves listed in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for the positioning of the Nukta character (U+093C) which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are proposed:
Variant 1  Variant 2
आ U+0906  आ़ U+0906 U+093C
ओ U+0913  ओ़ U+0913 U+093C
ा U+093E  ा़ U+093E U+093C
ो U+094B  ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. E.g. in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the leading code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric
definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
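The "not followed by a Nukta" condition can be illustrated with a short Python sketch. It is not the LGR's XML encoding of the rule; the function names are hypothetical, and the code only shows why the context condition stops the regenerative substitution described above.

```python
NUKTA = "\u093C"
# Variant 1 code points from Table 16: AA, O, vowel sign AA, vowel sign O.
VARIANT1 = {"\u0906", "\u0913", "\u093E", "\u094B"}


def mappable_positions(label):
    """Indexes where Variant 1 -> Variant 2 is defined:
    the code point is in VARIANT1 and is not followed by a Nukta."""
    return [
        i for i, ch in enumerate(label)
        if ch in VARIANT1 and label[i + 1:i + 2] != NUKTA
    ]


def add_nukta_at(label, i):
    """Apply the Variant 1 -> Variant 2 mapping at index i."""
    return label[:i + 1] + NUKTA + label[i + 1:]
```

After the mapping has been applied once at a position, that position is followed by a Nukta and is no longer mappable, so the invalid sequence with two consecutive Nuktas is never generated.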
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set  Mapping
Variant Set A  093E 0901 <--> 0949 0902
Variant Set B  0906 0901 <--> 0911 0902
Variant Set C  0906 0902 <--> 0974
Variant Set D  093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source  Glyph  Target  Glyph  Type  Variant Context
0906  आ  0906 093C  आ़  ↔ blocked  not followed-by-N
093E  ा  093E 093C  ा़  ↔ blocked  not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E respectively into the sequences of Variant Sets A, B, C and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902, 0903. (Both can be at
the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D: 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set  Mapping  Variant Contextual Rule
Variant Set A  093E 0901 <--> 093E 093C 0901 <--> 0949 0902  -
Variant Set B  0906 0901 <--> 0906 093C 0901 <--> 0911 0902  -
Variant Set C  0906 0902 <--> 0906 093C 0902 <--> 0974  when(followed-by-V-C-or-end)
Variant Set D  093B <--> 093E 0902 <--> 093E 093C 0902  when(followed-by-V-C-or-end)
Additional Notes
1 Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2 Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, they are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
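The extension of Sets A to D described in this section follows a mechanical pattern: every member that starts with 0906 or 093E gains a sibling with 093C inserted after that first code point. A minimal Python sketch of that step is given below; the data structure and the function name are hypothetical and chosen only for illustration.

```python
NUKTA = "093C"
# Code points that have a Nukta variant in the overlapping sets.
NUKTA_BASES = {"0906", "093E"}

# Original variant sets, each member written as a tuple of code points.
ORIGINAL_SETS = {
    "A": [("093E", "0901"), ("0949", "0902")],
    "B": [("0906", "0901"), ("0911", "0902")],
    "C": [("0906", "0902"), ("0974",)],
    "D": [("093B",), ("093E", "0902")],
}


def extend_with_nukta(members):
    """Add, for each member starting with 0906 or 093E,
    the same sequence with a Nukta after the first code point."""
    extended = list(members)
    for seq in members:
        if seq[0] in NUKTA_BASES:
            extended.append((seq[0], NUKTA) + seq[1:])
    return extended


EXTENDED_SETS = {name: extend_with_nukta(m) for name, m in ORIGINAL_SETS.items()}
```

Each extended set ends up with three members, matching the three-way mappings in the table of resulting variant sets. The contextual rules on Sets C and D are not modelled here.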
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1  Variant 2
ॳ U+0973  अं U+0905 U+0902
ऺ U+093A  ं U+0902
ॴ U+0974  आं U+0906 U+0902
ऻ U+093B  ां U+093E U+0902
ऎ U+090E  ऐ U+0910
ॆ U+0946  े U+0947
ॵ U+0975  औ U+0914
ॏ U+094F  ौ U+094C
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1  Variant 2
ँ U+0901  ॅं U+0945 U+0902
ाँ U+093E U+0901  ॉं U+0949 U+0902
अँ U+0905 U+0901  ॲं U+0972 U+0902
एँ U+090F U+0901  ऍं U+090D U+0902
आँ U+0906 U+0901  ऑं U+0911 U+0902
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character like a Consonant or a Vowel before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
IdeallythecaseofU+0945U+0902shouldberenderedas and the case of (U+0949
U+0902) should be rendered as where the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire, and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
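The context condition on the Candrabindu mapping can be sketched in Python as follows. The consonant test is an assumed simplification (a code point range plus the four Sindhi letters, ignoring excluded code points), and the function names are hypothetical; the sketch only shows which Candrabindu positions would qualify for the mapping to U+0945 U+0902.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"


def is_consonant(ch):
    """Assumption: simplified consonant test for illustration."""
    cp = ord(ch)
    return 0x0915 <= cp <= 0x0939 or cp in (0x097B, 0x097C, 0x097E, 0x097F)


def candrabindu_variant_positions(label):
    """Indexes of U+0901 that are preceded by C, or by C followed by N."""
    positions = []
    for i, ch in enumerate(label):
        if ch != CANDRABINDU or i == 0:
            continue
        prev = label[i - 1]
        after_c = is_consonant(prev)
        after_cn = prev == NUKTA and i >= 2 and is_consonant(label[i - 2])
        if after_c or after_cn:
            positions.append(i)
    return positions
```

A Candrabindu after a Vowel or a Matra is left out, because replacing it would place the Matra U+0945 after a character that a Matra cannot follow.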
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari  Gurmukhi
ं U+0902  ਂ U+0A02
इ U+0907  ਙ U+0A19
उ U+0909  ਤ U+0A24
ग U+0917  ਗ U+0A17
घ U+0918  ਬ U+0A2C
ट U+091F  ਟ U+0A1F
ठ U+0920  ਠ U+0A20
ढ U+0922  ਫ U+0A2B
प U+092A  ਧ U+0A27
भ U+092D  ਮ U+0A2E
म U+092E  ਸ U+0A38
व U+0935  ਕ U+0A15
ह U+0939  ਵ U+0A35
ऺ U+093A  ਂ U+0A02
़ U+093C  ਼ U+0A3C
ि U+093F  ਿ U+0A3F
ी U+0940  ੀ U+0A40
ॅ U+0945  ੱ U+0A71
ॆ U+0946  ੇ U+0A47
ॆ U+0946  ੋ U+0A4B
े U+0947  ੇ U+0A47
े U+0947  ੋ U+0A4B
ै U+0948  ੈ U+0A48
ॖ U+0956  ੁ U+0A41
ॗ U+0957  ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F  ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940  ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947  ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946  ਏ U+0A0F
त्त U+0924 U+094D U+0924  ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari  Bengali
म U+092E  ম U+09AE
ि U+093F  ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1 N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
The set V1 consists of these vowels:
a आ (U+0906) (Required in Santali language)
b ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a ा (U+093E) (Required in Santali language)
b ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (footnote 15)
3. M must be preceded by C or CN (footnote 16)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V cannot be preceded by H (details in "Case of V preceded by H")
Footnotes 15 and 16: CN is a C followed by an N.
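The seven rules above can be read as "what may precede this code point" checks. The following is a minimal sketch of that reading, not the normative LGR: the category sets are illustrative ranges (an assumption; the authoritative membership is Table 6), the function name wle_violations is hypothetical, and repertoire membership itself is not checked.

```python
# Illustrative category sets (assumption); the normative membership is in Table 6.
C = {chr(cp) for cp in range(0x0915, 0x093A)} - {"\u0929", "\u0931", "\u0934"}
V = {chr(cp) for cp in range(0x0905, 0x0915)} - {"\u090C"}
M = {chr(cp) for cp in range(0x093E, 0x094D)} - {"\u0944"}
B, D, X, H, N = "\u0902", "\u0901", "\u0903", "\u094D", "\u093C"
# Sets named in rule 1.
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
SIGN_RULE = {X: 4, B: 5, D: 6}


def wle_violations(label):
    """Return (index, rule number) pairs for every rule the label breaks."""
    found = []
    for i, ch in enumerate(label):
        prev = label[i - 1] if i >= 1 else ""
        prev2 = label[i - 2] if i >= 2 else ""
        # "C or CN": a consonant, or a Nukta that itself follows a consonant.
        after_c_or_cn = prev in C or (prev == N and prev2 in C)
        if ch == N and prev not in C1 | V1 | M1:
            found.append((i, 1))
        elif ch == H and not after_c_or_cn:
            found.append((i, 2))
        elif ch in M and not after_c_or_cn:
            found.append((i, 3))
        elif ch in SIGN_RULE and not (prev in V | C | M or prev == N):
            found.append((i, SIGN_RULE[ch]))
        elif ch in V and prev == H:
            found.append((i, 7))
    return found
```

Under these assumptions, the "Mango pickle" label discussed under "Case of V preceded by H" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930) is flagged at index 3 under rule 7, while a plain consonant string such as U+0915 U+092E U+0932 passes.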
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
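The first reason can be illustrated with a small check that U+0931 occurs only inside the two sequences of Table 7. This is a sketch for illustration only; the function name is hypothetical and an actual LGR tool would enforce this through the repertoire definition rather than a separate scan.

```python
RRA, VIRAMA = "\u0931", "\u094D"
EYELASH_FOLLOWERS = {"\u092F", "\u0939"}  # YA and HA, per Table 7


def rra_only_in_sequences(label):
    """True if every U+0931 is followed by U+094D and then U+092F or U+0939."""
    for i, ch in enumerate(label):
        if ch != RRA:
            continue
        # Slices return "" at the end of the label, which fails both tests.
        if label[i + 1:i + 2] != VIRAMA or label[i + 2:i + 3] not in EYELASH_FOLLOWERS:
            return False
    return True
```

Because each permitted sequence ends in a consonant (YA or HA), whatever follows it is then subject to the ordinary consonant rules, which is the second reason given above.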
Case of V preceded by H
As any valid akshar in Devanagari begins with either a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any valid akshar-ending character. Only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages / scripts:
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0/ (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
11 KachruYamuna1966AnIntroductiontoHindiSyntaxUrbanaUniversityof
IllinoisDepartmentofLinguistics
12 KalyanKaleandAnjaliSoman1986LearningMarathiShriVishakhaPrakashan
Pune
13 McGregorRS(1977)OutlineofHindiGrammar2ndedDelhiOxfordUniversity
Press
14 McGregorRS1972OutlineofHindiGrammarwithExercisesDelhiOxford
UniversityPress
15 McGregorRS1974HindiLiteratureoftheNineteenthandEarlyTwentieth
CenturiesWiesbadenHarrassowitz
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters / sequences
Table 21 below shows characters / character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1    Confusable 2
क U+0915    क़ U+0915 U+093C
ख U+0916    ख़ U+0916 U+093C
ग U+0917    ग़ U+0917 U+093C
च U+091A    च़ U+091A U+093C
छ U+091B    छ़ U+091B U+093C
ज U+091C    ज़ U+091C U+093C
ड U+0921    ड़ U+0921 U+093C
ढ U+0922    ढ़ U+0922 U+093C
फ U+092B    फ़ U+092B U+093C
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters / aksharas have an equivalent confusable in the other script. If there is even a single character / akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
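The "all constituents must have a counterpart" condition can be sketched as follows. The mapping holds only three Devanagari to Gurmukhi pairs taken from Table 22, treats each code point independently (an assumption; aksharas and multi-code-point sequences are ignored), and the function name is hypothetical.

```python
# Illustrative Devanagari -> Gurmukhi pairs from Table 22 (subset only).
CONFUSABLE = {
    "\u0922": "\u0A22",  # DDHA
    "\u0924": "\u0A1C",  # TA -> Gurmukhi JA
    "\u092F": "\u0A27",  # YA -> Gurmukhi DHA
}


def cross_script_counterpart(label, mapping=CONFUSABLE):
    """Return the other-script look-alike label, or None if any code point has no counterpart."""
    if not label or any(ch not in mapping for ch in label):
        return None
    return "".join(mapping[ch] for ch in label)
```

A single unmapped code point is enough to return None, mirroring the statement that one distinguishing character makes the whole string non-confusable.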
Devanagari confusable    Other script confusable    From script
ः U+0903    ઃ U+0A83    Gujarati
ः U+0903    ః U+0C03    Telugu
ः U+0903    ಃ U+0C83    Kannada
ः U+0903    ഃ U+0D03    Malayalam
ः U+0903    ඃ U+0D83    Sinhala
ः U+0903    ঃ U+0983 (footnote 17)    Bengali
उ U+0909    ও U+0993    Bengali
घ U+0918    ঘ U+0998    Bengali
ठ U+0920    ਨ U+0A28    Gurmukhi
ठ U+0920    ਰ U+0A30    Gurmukhi
ड U+0921    ਡ U+0A21    Gurmukhi
ड U+0921    ਤ U+0A24    Gurmukhi
ढ U+0922    ਢ U+0A22    Gurmukhi
त U+0924    ਜ U+0A1C    Gurmukhi
य U+092F    ਧ U+0A27    Gurmukhi
ॅ U+0945    ঁ U+0981    Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
30    0920    ठ    DEVANAGARI LETTER TTHA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
31    0921    ड    DEVANAGARI LETTER DDA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
32    0922    ढ    DEVANAGARI LETTER DDHA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][113]
33    0923    ण    DEVANAGARI LETTER NNA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][113]
34    0924    त    DEVANAGARI LETTER TA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
35    0925    थ    DEVANAGARI LETTER THA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
36    0926    द    DEVANAGARI LETTER DA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
37    0927    ध    DEVANAGARI LETTER DHA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
38    0928    न    DEVANAGARI LETTER NA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
39    092A    प    DEVANAGARI LETTER PA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
40    092B    फ    DEVANAGARI LETTER PHA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
41    092C    ब    DEVANAGARI LETTER BA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
42    092D    भ    DEVANAGARI LETTER BHA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
43    092E    म    DEVANAGARI LETTER MA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
44    092F    य    DEVANAGARI LETTER YA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
45    0930    र    DEVANAGARI LETTER RA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
46    0932    ल    DEVANAGARI LETTER LA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
47    0933    ळ    DEVANAGARI LETTER LLA    Consonant    Bodo, Konkani, Marathi, Nepali, Sanskrit    1 Nepali    [0][102][103][110][112][113]
48    0935    व    DEVANAGARI LETTER VA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
49    0936    श    DEVANAGARI LETTER SHA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
50    0937    ष    DEVANAGARI LETTER SSA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][113]
51    0938    स    DEVANAGARI LETTER SA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
52    0939    ह    DEVANAGARI LETTER HA    Consonant    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][104][105][108][113]
53    093A    ऺ    DEVANAGARI VOWEL SIGN OE    Matra    Kashmiri    4 Kashmiri    [11][105][108]
54    093B    ऻ    DEVANAGARI VOWEL SIGN OOE    Matra    Kashmiri    4 Kashmiri    [11][105][108]
55    093C    ़    DEVANAGARI SIGN NUKTA    Nukta    Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi    1 Hindi    [0][101][105][108][110][109][111]
56    093E    ा    DEVANAGARI VOWEL SIGN AA    Matra    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][113]
57    093F    ि    DEVANAGARI VOWEL SIGN I    Matra    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][113]
58    0940    ी    DEVANAGARI VOWEL SIGN II    Matra    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][113]
59    0941    ु    DEVANAGARI VOWEL SIGN U    Matra    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][113]
60    0942    ू    DEVANAGARI VOWEL SIGN UU    Matra    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][113]
61    0943    ृ    DEVANAGARI VOWEL SIGN VOCALIC R    Matra    Hindi, Marathi, Sanskrit    1 Hindi    [0][101][102][103]
62    0945    ॅ    DEVANAGARI VOWEL SIGN CANDRA E (= candra)    Matra    Hindi, Konkani, Marathi, Sanskrit, Kashmiri    1 Hindi    [0][100][101][108]
63    0946    ॆ    DEVANAGARI VOWEL SIGN SHORT E    Matra    Kashmiri    4 Kashmiri    [0][105][108]
64    0947    े    DEVANAGARI VOWEL SIGN E    Matra    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][105][108][113]
65    0948    ै    DEVANAGARI VOWEL SIGN AI    Matra    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][113]
66    0949    ॉ    DEVANAGARI VOWEL SIGN CANDRA O    Matra    Hindi, Konkani, Marathi, Kashmiri    1 Hindi    [0][100][108]
67    094A    ॊ    DEVANAGARI VOWEL SIGN SHORT O    Matra    Kashmiri    4 Kashmiri    [0][105][108]
68    094B    ो    DEVANAGARI VOWEL SIGN O    Matra    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][105][108][113]
69    094C    ौ    DEVANAGARI VOWEL SIGN AU    Matra    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][105][108][113]
70    094D    ्    DEVANAGARI SIGN VIRAMA    Halant / Virama    Most of the languages given in Section 3.2    1 Hindi, Nepali    [0][101][102][103][105][108][113]
71    094F    ॏ    DEVANAGARI VOWEL SIGN AW    Matra    Kashmiri    4 Kashmiri    [0][105][108]
72    0956    ॖ    DEVANAGARI VOWEL SIGN UE    Matra    Kashmiri    4 Kashmiri    [11][105][108]
73    0957    ॗ    DEVANAGARI VOWEL SIGN UUE    Matra    Kashmiri    4 Kashmiri    [11][105][108]
74    0972    ॲ    DEVANAGARI LETTER CANDRA A    Vowel    Konkani, Marathi, Kashmiri    2 Konkani, Marathi    [9][100][102][108][112]
75    0973    ॳ    DEVANAGARI LETTER OE    Vowel    Kashmiri    4 Kashmiri    [11][105][108]
76    0974    ॴ    DEVANAGARI LETTER OOE    Vowel    Kashmiri    4 Kashmiri    [11][105][108]
77    0975    ॵ    DEVANAGARI LETTER AW    Vowel    Kashmiri    4 Kashmiri    [11][105][108]
78    0976    ॶ    DEVANAGARI LETTER UE    Vowel    Kashmiri    4 Kashmiri    [11][105][108]
79    0977    ॷ    DEVANAGARI LETTER UUE    Vowel    Kashmiri    4 Kashmiri    [11][105][108]
80    097B    ॻ    DEVANAGARI LETTER GGA    Consonant    Sindhi    2 Sindhi    [8][104]
81    097C    ॼ    DEVANAGARI LETTER JJA    Consonant    Sindhi    2 Sindhi    [8][104]
82    097E    ॾ    DEVANAGARI LETTER DDDA    Consonant    Sindhi    2 Sindhi    [8][104]
83    097F    ॿ    DEVANAGARI LETTER BBA    Consonant    Sindhi    2 Sindhi    [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph" construct (footnote 13).
Sr. No.    Unicode Code Points    Sequence    Character Names    Example languages using the code point (not an exhaustive list)    Reference
1    0931 094D 092F    ऱ्य    DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA    Konkani, Marathi, Nepali    [106][107]
2    0931 094D 0939    ऱ्ह    DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA    Konkani, Marathi, Nepali    [106][107]
Table 7 Sequences
13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by the normal Ra, U+0930), the term "Reph" is used here.
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No.    Unicode Code Point    Glyph    Character Name    Reason for exclusion
1    U+0904    ऄ    DEVANAGARI LETTER SHORT A    Usage unknown. Not required explicitly by any language.
2    U+090C    ऌ    DEVANAGARI LETTER VOCALIC L    Not in modern usage. Excluded as per conservatism principle.
3    U+0929    ऩ    DEVANAGARI LETTER NNNA    Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4    U+0934    ऴ    DEVANAGARI LETTER LLLA    Not required in any spoken language. Required only for transcribing Dravidian l.
5    U+0944    ॄ    DEVANAGARI VOWEL SIGN VOCALIC RR    Not in modern usage. Excluded as per conservatism principle.
6    U+0979    ॹ    DEVANAGARI LETTER ZHA    Not required in any spoken language. Required only in transliteration of Avestan.
7    U+097A    ॺ    DEVANAGARI LETTER HEAVY YA    Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of forming their words, based on a unit known as the akshar. The next section gives detailed akshar formation rules as applicable to the representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition / deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first subsection lists the categories of the characters in the form of variables; in the rules, the variable names are used instead of their descriptive names. The second subsection lists four operators, along with their functions, which are assumed while specifying the rules. The following two subsections describe the two major categories of akshar formation, the first of which begins with a vowel and the second with a consonant. These rules are based on an Indian Standard (IS 13194:1991), popularly known as Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
5.5.2 Operators used
Symbol    Function
|    Alternative
[ ]    Optional
*    Variable Repetition
( )    Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may optionally be followed by an Anusvara (B), Candrabindu (D) or Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description    Sequence    Example    Constituting characters
Vowel    V    अ (a)    U+0905
Vowel + Anusvara    V[B]    अं (aṁ)    U+0905 U+0902
Vowel + Candrabindu    V[D]    अँ (aṃ)    U+0905 U+0901
Vowel + Visarga    V[X]    अः (aḥ)    U+0905 U+0903
Table 9
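The vowel sequence V[B|D|X] maps directly onto a regular expression. The sketch below is illustrative only: the vowel range U+0905 to U+0914 is an assumption standing in for the vowels actually listed in Table 6, and the names are hypothetical.

```python
import re

V = "[\u0905-\u0914]"          # independent vowels (illustrative range)
BDX = "[\u0901\u0902\u0903]"   # Candrabindu, Anusvara, Visarga

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(f"{V}{BDX}?")


def is_vowel_sequence(text):
    """True if the whole string is a single vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Because the optional class admits at most one sign, a Visarga after an Anusvara (for example U+0905 U+0902 U+0903) does not match, in line with the restriction stated above.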
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may optionally be followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. The consonant sequence may be extended further after the N, M and H. Each of these is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description    Sequence    Example    Constituting characters
Consonant    C    क (ka)    U+0915 <single character>
Consonant + Nukta    C[N]    क़ (ḳa)    U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign (Matra [M]), Anusvara [B], Candrabindu [D], Visarga [X] or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description    Sequence    Example    Constituting characters
Consonant + Matra    C[M]    कि (ki)    U+0915 U+093F
Consonant + Anusvara    C[B]    कं (kaṁ)    U+0915 U+0902
Consonant + Candrabindu    C[D]    कँ (kaṃ)    U+0915 U+0901
Consonant + Visarga    C[X]    कः (kaḥ)    U+0915 U+0903
Consonant + Halant    C[H]    क् (k, pure consonant)    U+0915 U+094D
Table 11
2A. A CM sequence can optionally be followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description    Sequence    Example    Constituting characters
Consonant + Matra + Anusvara    CM[B]    कीं (kīṁ)    U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu    CM[D]    काँ (kāṃ)    U+0915 U+093E U+0901
Consonant + Matra + Visarga    CM[X]    कीः (kīḥ)    U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant (footnote 14): *3(CH)C
Example:
Sequence Description    Sequence    Example    Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant    CHCHCHC    न्क्र्य (nkrya)    U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
14 In the case of Sanskrit, it can join up to 5 consonants.
Subsets
3A. The combination may be followed by M, B, D or X.
Example:
Sequence Description    Sequence    Example    Constituting characters
Consonant + Halant + Consonant + Matra    CHC[M]    क्की (kkī)    U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara    CHC[B]    क्कं (kkaṁ)    U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu    CHC[D]    क्कँ (kkaṃ)    U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga    CHC[X]    क्कः (kkaḥ)    U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X.
Example:
Sequence Description    Sequence    Example    Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara    CHCM[B]    क्कीं (kkīṁ)    U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu    CHCM[D]    क्कीँ (kkīṃ)    U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga    CHCM[X]    क्कीः (kkīḥ)    U+0915 U+094D U+0915 U+0940 U+0903
Table 15
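Rules 1 to 3B for the consonant sequence can be combined into one regular expression. This is a sketch under stated assumptions: the character classes are illustrative ranges rather than the exact Table 6 membership, "C" is read as "C optionally followed by N" as noted under rule 1, and the strict reading used here does not accept a conjunct that ends in a Halant, since rules 3A and 3B do not list H. All names are hypothetical.

```python
import re

C = "[\u0915-\u0928\u092A-\u0930\u0932\u0933\u0935-\u0939]"  # consonants (illustrative)
M = "[\u093E-\u0943\u0945-\u094C]"                            # matras (illustrative)
BDX = "[\u0901-\u0903]"                                       # Candrabindu, Anusvara, Visarga
N, H = "\u093C", "\u094D"
CN = f"{C}{N}?"  # a consonant together with its optional Nukta

CONSONANT_SEQUENCE = re.compile(
    f"(?:{CN}(?:{H}|{M}?{BDX}?)"            # rules 1, 2 and 2A
    f"|(?:{CN}{H}){{1,3}}{CN}{M}?{BDX}?)"   # rules 3, 3A and 3B (up to 4 consonants)
)


def is_consonant_sequence(text):
    """True if the whole string is a single Hindi consonant sequence."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

The `{1,3}` bound encodes the Hindi limit of four joined consonants; as noted above, the WLE rules in Section 7 do not impose that limit, so a label-level check would not use this bound.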
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules when one takes into account digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
6 Variants
There are no characters / character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed as variants, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters / sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and are hence being proposed as variants. Following is a brief description of these variants, followed by the variants themselves in Table 16 and Table 17.
6.1 Vowel / Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset (a prefix) of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). In theory this implies a regenerative tendency: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of the sequence U+0906 U+093C. By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C). Thus, the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both the forward and the reverse (symmetric) variant mappings using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to the context rules on the Nukta, the condition is only required for the first reason in this case.
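The "not followed by Nukta" condition can be illustrated with a short Python sketch. This is only an illustration of the rule as worded above, not the accompanying LGR XML or the behavior of any LGR tool; the function names `nukta_variant_positions` and `apply_nukta_variant` are hypothetical.

```python
NUKTA = 0x093C
# Variant 1 code points from Table 16; each maps to itself followed by a Nukta.
NUKTA_BASES = {0x0906, 0x0913, 0x093E, 0x094B}


def nukta_variant_positions(label):
    """Return the indexes at which the Variant 1 -> Variant 2 mapping is defined."""
    cps = [ord(ch) for ch in label]
    positions = []
    for i, cp in enumerate(cps):
        followed_by_nukta = i + 1 < len(cps) and cps[i + 1] == NUKTA
        # The mapping exists only if the base is NOT already followed by a Nukta.
        if cp in NUKTA_BASES and not followed_by_nukta:
            positions.append(i)
    return positions


def apply_nukta_variant(label, index):
    """Insert a Nukta after the code point at `index` (Variant 1 -> Variant 2)."""
    if index not in nukta_variant_positions(label):
        raise ValueError("variant mapping is not defined at this position")
    return label[: index + 1] + chr(NUKTA) + label[index + 1:]
```

Applying the mapping once to U+0906 U+092E yields U+0906 U+093C U+092E, and the mapping is then no longer defined at that position, so the double-Nukta sequence described above cannot be generated.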
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Direction | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ | blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ | blocked | not followed-by-N
When the variant from the overlapping sets is substituted for 0906 or 093E, respectively, in the sequences of Variant Sets A, B, C and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence or absence of Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection of these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
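The extension of Sets A to D can be reproduced mechanically. The sketch below is a hedged illustration only: it takes the four original mappings listed in this section as data and inserts a Nukta after each 0906 or 093E that is not already followed by one. The names `VARIANT_SETS`, `nukta_forms`, `extend_sets` and `fmt` are hypothetical and are not part of the LGR or of any tool.

```python
NUKTA = 0x093C
NUKTA_BASES = {0x0906, 0x093E}  # the two bases that overlap Sets A-D

# Original mappings as listed before the extension (hex code points).
VARIANT_SETS = {
    "A": [(0x093E, 0x0901), (0x0949, 0x0902)],
    "B": [(0x0906, 0x0901), (0x0911, 0x0902)],
    "C": [(0x0906, 0x0902), (0x0974,)],
    "D": [(0x093B,), (0x093E, 0x0902)],
}


def nukta_forms(seq):
    """Yield each sequence obtained by inserting one Nukta after a base."""
    for i, cp in enumerate(seq):
        next_is_nukta = i + 1 < len(seq) and seq[i + 1] == NUKTA
        if cp in NUKTA_BASES and not next_is_nukta:
            yield seq[: i + 1] + (NUKTA,) + seq[i + 1:]


def extend_sets(variant_sets):
    """Return a copy of the sets with the Nukta-bearing sequences appended."""
    extended = {}
    for name, members in variant_sets.items():
        result = list(members)
        for seq in members:
            for new_seq in nukta_forms(seq):
                if new_seq not in result:
                    result.append(new_seq)
        extended[name] = result
    return extended


def fmt(seq):
    """Format a code point tuple the way the tables in this section do."""
    return " ".join(f"{cp:04X}" for cp in seq)
```

Calling `extend_sets(VARIANT_SETS)` adds exactly one sequence to each set, matching the four additions listed above. The sketch does not model the when(followed-by-V-C-or-end) context rules.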
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences that include 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
ऺ U+093A | ं U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
ॆ U+0946 | े U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mapping targets will need to be listed in the repertoire with matching code point contexts, so that both source and target of each variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence or absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case for making them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1 | Variant 2
ँ U+0901 | ॅं U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should each be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence to the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that this sequence begins with a Matra sign, i.e. U+0945. While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for U+0945, which is a Matra. A Matra can only be preceded by a Consonant or by a Consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or by a Consonant followed by a Nukta.
There is no additional constraint on the character preceding (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as the mapping (U+0901) -> (U+0945 U+0902).
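The context condition for the Candrabindu mapping can be sketched in Python as follows. This is a simplified illustration under stated assumptions: `is_consonant` approximates category C by the block range U+0915 to U+0939 rather than by the normative repertoire table, and all function names are hypothetical. The sketch also replaces every eligible Candrabindu at once, whereas a real variant generator would enumerate combinations per position.

```python
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093C"


def is_consonant(ch):
    # Assumption: category C is approximated by the range U+0915..U+0939;
    # the normative list is the code point repertoire of the LGR.
    return "\u0915" <= ch <= "\u0939"


def follows_only_c_or_n(label, index):
    """True if position `index` is preceded by C, or by C followed by N."""
    if index >= 1 and is_consonant(label[index - 1]):
        return True
    return (
        index >= 2
        and label[index - 1] == NUKTA
        and is_consonant(label[index - 2])
    )


def candrabindu_variant(label):
    """Replace each U+0901 by U+0945 U+0902 wherever the context allows it."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and follows_only_c_or_n(label, i):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

For example, a Candrabindu after क (U+0915) is mapped, while a Candrabindu after the vowel अ (U+0905) or after a Matra is left unchanged, because U+0945 could not validly appear in those positions.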
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing one of these variants is chosen first, the other equivalent variant labels should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are considered equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं U+0902 | ਂ U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
ऺ U+093A | ਂ U+0A02
़ U+093C | ਼ U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
ॅ U+0945 | ੱ U+0A71
ॆ U+0946 | ੇ U+0A47
ॆ U+0946 | ੋ U+0A4B
े U+0947 | ੇ U+0A47
े U+0947 | ੋ U+0A4B
ै U+0948 | ੈ U+0A48
ॖ U+0956 | ੁ U+0A41
ॗ U+0957 | ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (where CN is a C followed by an N).
3. M must be preceded by C or CN (where CN is a C followed by an N).
4. X must be preceded by any of V, C, N or M.
5. B must be preceded by any of V, C, N or M.
6. D must be preceded by any of V, C, N or M.
7. V can NOT be preceded by H (details in "Case of V preceded by H").
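As an illustration of how WLE rules 1 to 7 can be checked mechanically, the following Python sketch classifies each character and tests its predecessor. It is not the normative rule set: `category` approximates the C, V and M categories by code point ranges instead of the repertoire table (an assumption made for brevity), characters outside those ranges are ignored, and the Eyelash Reph sequences and variant context rules are not modeled. The names `category` and `violations` are hypothetical.

```python
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = set("\u0906\u0913")
M1 = set("\u093E\u094B")
SIGNS = {"\u0901": "D", "\u0902": "B", "\u0903": "X", "\u093C": "N", "\u094D": "H"}


def category(ch):
    """Map a character to its WLE symbol (approximate ranges, see lead-in)."""
    if ch in SIGNS:
        return SIGNS[ch]
    cp = ord(ch)
    if 0x0905 <= cp <= 0x0914 or 0x0972 <= cp <= 0x0977:
        return "V"
    if 0x0915 <= cp <= 0x0939 or cp in (0x097B, 0x097C):
        return "C"
    if cp in (0x093A, 0x093B, 0x094F, 0x0956, 0x0957) or 0x093E <= cp <= 0x094C:
        return "M"
    return None


def violations(label):
    """Return a list of (index, rule number) pairs for WLE rules 1-7."""
    cats = [category(ch) for ch in label]
    found = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i >= 1 else None
        prev_ch = label[i - 1] if i >= 1 else ""
        # CN: the previous character is a Nukta that itself follows a Consonant.
        prev_is_cn = prev == "N" and i >= 2 and cats[i - 2] == "C"
        if cat == "N" and prev_ch not in (C1 | V1 | M1):
            found.append((i, 1))
        elif cat == "H" and not (prev == "C" or prev_is_cn):
            found.append((i, 2))
        elif cat == "M" and not (prev == "C" or prev_is_cn):
            found.append((i, 3))
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            found.append((i, {"X": 4, "B": 5, "D": 6}[cat]))
        elif cat == "V" and prev == "H":
            found.append((i, 7))
    return found
```

An empty result means that none of the seven rules is violated under these simplifications. The label U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 discussed in "Case of V preceded by H" is reported against rule 7 at index 3.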
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• The variant is not defined if followed by a Nukta (see 6.1.1).
• The variant is not defined if it is not followed by V or C (including RRA) or the end of the label (see 6.1.2).
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार aːməchaːr "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | Basanta Kumar Panda | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | Debajit Sharma | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DN Domains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualize thy soul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K P Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11.0, http://www.unicode.org/versions/Unicode11.0.0/ (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Booksarticlesandwebographiesconsulted
Followingisathematicallysortedsetofdocumentsbooksarticlesandwebographies
consultedinthedraftingofthisreport
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
11 KachruYamuna1966AnIntroductiontoHindiSyntaxUrbanaUniversityof
IllinoisDepartmentofLinguistics
12 KalyanKaleandAnjaliSoman1986LearningMarathiShriVishakhaPrakashan
Pune
13 McGregorRS(1977)OutlineofHindiGrammar2ndedDelhiOxfordUniversity
Press
14 McGregorRS1972OutlineofHindiGrammarwithExercisesDelhiOxford
UniversityPress
15 McGregorRS1974HindiLiteratureoftheNineteenthandEarlyTwentieth
CenturiesWiesbadenHarrassowitz
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters and character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in a given context, e.g. in a browser bar, as part of a domain name, or as a single word where there is no surrounding text from the same script to help distinguish them, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
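The "all constituents must have a counterpart" condition can be sketched as a simple lookup. The example below is illustrative only: `DEVA_TO_GURU` is a hypothetical name and holds just a handful of the single-code-point Devanagari to Gurmukhi pairs from Table 19, it ignores the multi-code-point rows and the one-to-many cases, and it does not represent how an LGR tool computes whole-label variants.

```python
# A few single-code-point pairs from Table 19 (illustrative subset only).
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # U+0917 -> U+0A17
    "\u091F": "\u0A1F",  # U+091F -> U+0A1F
    "\u0920": "\u0A20",  # U+0920 -> U+0A20
    "\u092E": "\u0A38",  # U+092E -> U+0A38
    "\u093F": "\u0A3F",  # U+093F -> U+0A3F
    "\u0940": "\u0A40",  # U+0940 -> U+0A40
}


def cross_script_label(label, mapping=DEVA_TO_GURU):
    """Return the other-script look-alike label, or None if any character
    of the label has no counterpart in the mapping."""
    out = []
    for ch in label:
        if ch not in mapping:
            # One unmatched character is enough to make the label distinct.
            return None
        out.append(mapping[ch])
    return "".join(out)
```

With this subset, U+091F U+0940 U+092E maps to U+0A1F U+0A40 U+0A38, while U+091F U+0940 U+0915 yields `None` because U+0915 has no entry.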
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 (see note 17) | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali
Table 22: Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
40 092B फ DEVANAGARILETTERPHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
41 092C ब DEVANAGARILETTERBA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
42 092D भ DEVANAGARILETTERBHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
43 092E म DEVANAGARILETTERMA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
44 092F य DEVANAGARILETTERYA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
45 0930 र DEVANAGARILETTERRA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
46 0932 ल DEVANAGARILETTERLA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
47 0933 ळ DEVANAGARILETTERLLA Consonant
BodoKonkaniMarathiNepali
Sanskrit1Nepali
[0][102][103][110][112]
[113]
48 0935 व DEVANAGARILETTERVA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
49 0936 श DEVANAGARILETTERSHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
50 0937 ष DEVANAGARILETTERSSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104]
[113]
51 0938 स DEVANAGARILETTERSA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
52 0939 ह DEVANAGARILETTERHA Consonant
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][104][105][108]
[113]
53 093A DEVANAGARI VOWEL SIGN OE Matra Kashmiri 4Kashmiri [11][105][108]
54 093B ऻ DEVANAGARI VOWEL SIGN OOE Matra Kashmiri 4Kashmiri [11][105][108]
55 093C DEVANAGARISIGNNUKTA Nukta
BodoHindiKashmiriMaithiliSantaliSindhi
1Hindi[0][101][105][108][110][109][111]
56 093E ा DEVANAGARIVOWELSIGNAA Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
57 093F ि DEVANAGARIVOWELSIGNI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
58 0940 ी DEVANAGARIVOWELSIGNII Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
59 0941 DEVANAGARIVOWELSIGNU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
60 0942 DEVANAGARIVOWELSIGNUU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
61 0943
DEVANAGARIVOWELSIGNVOCALICR
Matra HindiMarathiSanskrit 1Hindi [0][101][102]
[103]
62 0945
DEVANAGARIVOWELSIGNCANDRAE=candra
MatraHindiKonkaniMarathiSanskrit
Kashmiri1Hindi [0][100][101]
[108]
63 0946
DEVANAGARIVOWELSIGNSHORTE
Matra Kashmiri 4Kashmiri [0][105][108]
64 0947 DEVANAGARIVOWELSIGNE Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
65 0948 DEVANAGARIVOWELSIGNAI Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][113]
66 0949 ॉ DEVANAGARIVOWELSIGNCANDRAO
Matra HindiKonkaniMarathiKashmiri 1Hindi [0][100][108]
67 094A ॊ DEVANAGARIVOWELSIGNSHORTO
Matra Kashmiri 4Kashmiri [0][105][108]
68 094B ो DEVANAGARIVOWELSIGNO Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
69 094C ौ DEVANAGARIVOWELSIGNAU Matra
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
70 094D DEVANAGARISIGN
VIRAMAHalantVirama
Mostofthelanguagesgivenin
section32
1HindiNepali
[0][101][102][103][105][108][113]
71 094F ॏ DEVANAGARIVOWELSIGNAW Matra Kashmiri 4Kashmiri [0][105][108]
72 0956 DEVANAGARIVOWELSIGNUE Matra Kashmiri 4Kashmiri [11][105][108]
73 0957 DEVANAGARIVOWELSIGNUUE Matra Kashmiri 4Kashmiri [11][105][108]
74 0972 ॲ
DEVANAGARILETTERCANDRA
AVowel KonkaniMarathi
Kashmiri2KonkaniMarathi
[9][100][102][108][112]
75 0973 ॳ DEVANAGARILETTEROE Vowel Kashmiri 4Kashmiri [11][105][108]
76 0974 ॴ DEVANAGARILETTEROOE Vowel Kashmiri 4Kashmiri [11][105][108]
77 0975 ॵ DEVANAGARILETTERAW Vowel Kashmiri 4Kashmiri [11][105][108]
78 0976 ॶ DEVANAGARILETTERUE Vowel Kashmiri 4Kashmiri [11][105][108]
79 0977 ॷ DEVANAGARILETTERUUE Vowel Kashmiri 4Kashmiri [11][105][108]
80 097B ॻ DEVANAGARILETTERGGA Consonant Sindhi 2Sindhi [8][104]
81 097C ॼ DEVANAGARILETTERJJA Consonant Sindhi 2Sindhi [8][104]
82 097E ॾ DEVANAGARILETTERDDDA Consonant Sindhi 2Sindhi [8][104]
83 097F ॿ DEVANAGARILETTERBBA Consonant Sindhi 2Sindhi [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of DEVANAGARI LETTER RRA in the repertoire, so that the "Eyelash Reph" construct (see footnote 13) can be included.
Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7 Sequences
13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by normal Ra, U+0930), the term "Reph" is used here.
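The conditional inclusion of U+0931 can be read as a simple check: the code point is acceptable only where it begins one of the two sequences in Table 7. The following is a minimal Python sketch of that reading; the function name is hypothetical and is not part of the LGR or of any LGR tool.

```python
import re

RRA = "\u0931"  # DEVANAGARI LETTER RRA

# The two permitted sequences from Table 7: RRA + VIRAMA + (YA | HA).
EYELASH_REPH = re.compile("\u0931\u094D[\u092F\u0939]")


def rra_only_in_sequences(label: str) -> bool:
    """Return True if every U+0931 in the label is part of a Table 7 sequence."""
    # Blank out each permitted sequence, then look for a leftover RRA.
    remainder = EYELASH_REPH.sub(" ", label)
    return RRA not in remainder
```

A label containing a bare U+0931, or U+0931 followed by Virama and any consonant other than YA or HA, is rejected by this check.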
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. The next section gives detailed akshar formation rules as applicable to the representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodating additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second section lists four operators, along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol | Function
| | Alternative
[] | Optional
* | Variable Repetition
() | Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ (a) | U+0905
Vowel + Anusvara | V[B] | अं (aṁ) | U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ (aṃ) | U+0905 U+0901
Vowel + Visarga | V[X] | अः (aḥ) | U+0905 U+0903
Table 9
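The pattern V[B|D|X] maps directly onto a regular expression. The sketch below is illustrative only: the vowel range is an assumed stand-in for category V (it is not the full repertoire of Table 6), and the function name is hypothetical.

```python
import re

# Assumed, illustrative category members (not the full repertoire).
V = "\u0905-\u090B\u090D-\u0914"   # independent vowels; U+090C is excluded (Section 5.3)
B, D, X = "\u0902", "\u0901", "\u0903"  # Anusvara, Candrabindu, Visarga

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(f"[{V}][{B}{D}{X}]?")


def is_vowel_sequence(text: str) -> bool:
    """Return True if the whole string is a single vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Because the modifier is optional and limited to one, a Visarga following an Anusvara (for example U+0905 U+0902 U+0903) does not match.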
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the consonant sequence after the N, M and H. Each of these is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क (ka) | U+0915 <single character>
Consonant + Nukta | C[N] | क़ (ḳa) | U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], Visarga [X] or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि (ki) | U+0915 U+093F
Consonant + Anusvara | C[B] | कं (kaṁ) | U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ (kaṃ) | U+0915 U+0901
Consonant + Visarga | C[X] | कः (kaḥ) | U+0915 U+0903
Consonant + Halant | C[H] | क् (k, pure consonant) | U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं (kīṁ) | U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ (kāṃ) | U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः (kīḥ) | U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14):
*3(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य (nkrya) | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
14 In the case of Sanskrit, it can join up to 5 consonants.
Subsets
3A. The combination may be followed by M, B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की (kkī) | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं (kkaṁ) | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ (kkaṃ) | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः (kkaḥ) | U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X.
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं (kkīṁ) | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ (kkīṃ) | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः (kkīḥ) | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
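The consonant sequence rules 1 to 3B can be folded into a single pattern. The sketch below is one possible reading, not the panel's implementation: the character ranges are assumed, illustrative stand-ins for categories C and M, and the limit of three consonant-plus-Halant links reflects the Hindi rule (the WLE rules in Section 7 impose no such limit).

```python
import re

# Assumed, illustrative category members (not the full repertoire).
C = "\u0915-\u0928\u092A-\u0930\u0932\u0933\u0935-\u0939"  # consonants
M = "\u093E-\u0943\u0945-\u094C"                            # matras
N, H = "\u093C", "\u094D"                                   # Nukta, Halant
BDX = "\u0901\u0902\u0903"                                  # Candrabindu, Anusvara, Visarga

CONSONANT_SEQUENCE = re.compile(
    f"(?:[{C}]{N}?{H}){{0,3}}"  # up to three consonant(+Nukta)+Halant links
    f"[{C}]{N}?"                # final consonant, optionally with Nukta
    f"(?:{H}|[{M}]?[{BDX}]?)"   # then a Halant, or optional Matra and optional B/D/X
)


def is_consonant_sequence(text: str) -> bool:
    """Return True if the whole string is a single consonant sequence."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

The examples of Tables 10 to 15 all match this pattern, while a conjunct of five consonants or a Matra without a preceding consonant does not.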
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
6 Variants
There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated opening up the Whole Label Evaluation rules (given in Section 7) for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can appear deceptively similar to a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the leading U+0906 of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
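The effect of the "not followed by a Nukta" condition can be illustrated with a short Python sketch. The helper names are hypothetical; the sketch only shows why the condition stops the regenerative substitution described above.

```python
NUKTA = "\u093C"

# Variant 1 code points from Table 16.
NUKTA_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variant_positions(label: str) -> list:
    """Indices where the mapping 'base -> base + Nukta' is defined.

    The mapping applies only where the base is NOT already followed by a Nukta.
    """
    return [
        i for i, ch in enumerate(label)
        if ch in NUKTA_BASES and label[i + 1:i + 2] != NUKTA
    ]


def add_nukta_at(label: str, i: int) -> str:
    """Apply the Variant 1 -> Variant 2 mapping at index i."""
    return label[:i + 1] + NUKTA + label[i + 1:]
```

Once the mapping has been applied at a position, that position no longer satisfies the context condition, so a second Nukta cannot be generated there.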
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when (followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when (followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when (followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when (followed-by-V-C-or-end)
Additional Notes:
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when (follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
ऺ U+093A | ं U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
ॆ U+0946 | े U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of a Halant coming at the end is that of a vowel killer, many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case for making them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1 | Variant 2
ँ U+0901 | ॅं U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by a Devanagari Letter Ka (क, U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, Vowel, Nukta or Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire, and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
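The context rule of 6.4.1 reduces to a look-behind of one or two code points. The sketch below is a hedged illustration: the consonant test uses the block U+0915..U+0939 as an assumed stand-in for category C (which also covers a few code points not in the repertoire), and the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"


def is_consonant(ch: str) -> bool:
    # Assumption: the main consonant block U+0915..U+0939 stands in for category C.
    return "\u0915" <= ch <= "\u0939"


def candrabindu_variant_defined(label: str, i: int) -> bool:
    """True if the U+0901 <-> U+0945 U+0902 mapping applies at index i.

    The Candrabindu must be preceded by a Consonant, or by a Consonant
    followed by a Nukta.
    """
    if label[i] != CANDRABINDU:
        return False
    if i >= 1 and is_consonant(label[i - 1]):
        return True
    return i >= 2 and label[i - 1] == NUKTA and is_consonant(label[i - 2])
```

A Candrabindu that follows a Vowel or a Matra therefore has no such variant, since substituting it would place the Matra U+0945 in an invalid position.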
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are the variants that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं U+0902 | ਂ U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
ऺ U+093A | ਂ U+0A02
़ U+093C | ਼ U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
ॅ U+0945 | ੱ U+0A71
ॆ U+0946 | ੇ U+0A47
ॆ U+0946 | ੋ U+0A4B
े U+0947 | ੇ U+0A47
े U+0947 | ੋ U+0A4B
ै U+0948 | ੈ U+0A48
ॖ U+0956 | ੁ U+0A41
ॗ U+0957 | ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLE rules that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ, DEVANAGARI LETTER RRA), H is 094D (्, DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य, DEVANAGARI LETTER YA) or 0939 (ह, DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (required in the Santali language)
b. ओ (U+0913) (required in the Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (required in the Santali language)
b. ो (U+094B) (required in the Santali language)
2. H must be preceded by C or CN (see footnote 15)
3. M must be preceded by C or CN (see footnote 16)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
15 where CN is a C followed by an N
16 where CN is a C followed by an N
Additional rules are used only for variants where a Nukta maps to null, or for variants that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is not defined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
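The seven WLE rules above each constrain a code point by what immediately precedes it, so they can be checked in a single left-to-right pass. The following Python sketch is illustrative only: the category ranges are assumptions approximating Table 6 (they do not exclude every code point left out of the repertoire, and the Eyelash Reph sequences are not handled separately), and the function names are hypothetical.

```python
# Sets named in rule 1.
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}

SINGLES = {0x0901: "D", 0x0902: "B", 0x0903: "X", 0x093C: "N", 0x094D: "H"}


def category(ch: str):
    """Assumed, approximate category lookup (see Table 6 for the real repertoire)."""
    cp = ord(ch)
    if cp in SINGLES:
        return SINGLES[cp]
    if 0x0905 <= cp <= 0x0914 or 0x0972 <= cp <= 0x0977:
        return "V"
    if 0x0915 <= cp <= 0x0939 or cp in (0x097B, 0x097C, 0x097E, 0x097F):
        return "C"
    if cp in (0x093A, 0x093B, 0x094F, 0x0956, 0x0957) or 0x093E <= cp <= 0x094C:
        return "M"
    return None


def wle_violations(label: str) -> list:
    """Return the numbers of the WLE rules violated, in order of occurrence."""
    violations = []
    for i, ch in enumerate(label):
        cat = category(ch)
        prev = label[i - 1] if i > 0 else ""
        prev_cat = category(prev) if prev else None
        # "C or CN": a consonant, or a consonant followed by a Nukta.
        after_c_or_cn = prev_cat == "C" or (
            prev_cat == "N" and i >= 2 and category(label[i - 2]) == "C")
        if cat == "N" and prev not in (C1 | V1 | M1):
            violations.append(1)
        elif cat == "H" and not after_c_or_cn:
            violations.append(2)
        elif cat == "M" and not after_c_or_cn:
            violations.append(3)
        elif cat in ("X", "B", "D") and prev_cat not in ("V", "C", "N", "M"):
            violations.append({"X": 4, "B": 5, "D": 6}[cat])
        elif cat == "V" and prev_cat == "H":
            violations.append(7)
    return violations
```

For instance, हिन्दी (U+0939 U+093F U+0928 U+094D U+0926 U+0940) yields no violations, while a label in which a Vowel directly follows a Halant is flagged under rule 7.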
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of a context rule is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार (aːməchaːr, "Mango pickle") (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi/Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K. P. Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR]IntegrationPanelMaximalStartingRepertoiremdashMSR-4OverviewandRationale7February2019httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf(Accessedon18thFeb2019)
[EGIDS]ExpandedGradedIntergenerationalDisruptionScalehttpswwwethnologuecomaboutlanguage-status(Accessedon13thNov2017)
[NBGP]Neo-BrahmiGenerationPanel
[RFC7940]DaviesKandAFreytagRepresentingLabelGenerationRulesetsusingXMLRFC7940August2016httpstoolsietforghtmlrfc7940(Accessedon1stDec2017)
[RFC8228] A. Freytag, “Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels”, August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD]genericTopLevelDomain
[ISCII]IndianScriptCodeforInformationInterchangehttpscdacinindexaspxid=mlc_gist_iscii(Accessedon2ndFeb2018)
[GIST]GraphicsIntelligencebasedScriptTechnologieshttpscdacinindexaspxid=gist(Accessedon2ndFeb2018)
[C-DAC]CentreforDevelopmentofAdvancedComputinghttpscdacin(Accessedon2ndFeb2018)
[0]TheUnicodeStandard11httpwwwunicodeorgversionsUnicode110(Accessedon12thDec2017)
[8]TheUnicodeStandard50httpwwwunicodeorgversionsUnicode500(Accessedon12thDec2017)
[9]TheUnicodeStandard51httpwwwunicodeorgversionsUnicode510(Accessedon12thDec2017)
[11]TheUnicodeStandard60httpwwwunicodeorgversionsUnicode600(Accessedon12thDec2017)
[100]DevanāgarīVIPTeamldquoVariantIssuesReportrdquoICANN3rdOct2011httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf(Accessedon10thOct2017)
[101]OmniglotHindihttpswwwomniglotcomwritinghindihtm(Accessedon10thOct2017)
[102]OmniglotMarathihttpswwwomniglotcomwritingmarathihtm(Accessedon10thOct2017)
[103]OmniglotSanskrithttpswwwomniglotcomwritingsanskrithtm(Accessedon10thOct2017)
[104]OmniglotSindhihttpswwwomniglotcomwritingsindhihtm(Accessedon10thOct2017)
[105]OmniglotKashmirihttpswwwomniglotcomwritingkashmirihtm(Accessedon10thOct2017)
[106]Unicode1000SouthandCentralAsia-I-OfficialScriptsofIndiardquoPage456(R5andR5a)httpwwwunicodeorgversionsUnicode1000ch12pdf(Accessedon13thNov2017)
[107]UnicodeIndicGroupDevanagariEyelashRahttpunicodeorg~emulleriwgp8utcdochtml(Accessedon13thNov2017)
[108]MKRainaHowtoreadandwriteKashmiriinDevanagarihttpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf(Accessedon12thDec2017)
[109]CentralHindiDirectorate-MinistryofHRD-GovtofIndiaDevanāgarīAlphabetanditsRomanizationhttphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml(Accessedon12thDec2017
[110]OmniglotBodohttpswwwomniglotcomwritingbodohtm(Accessedon12thDec2017)
[111]OmniglotMaithilihttpswwwomniglotcomwritingmaithilihtm(Accessedon12thDec2017)
[112]OmniglotKonkanihttpswwwomniglotcomwritingkonkanihtm(Accessedon20thMay2018)
[113]OmniglotNepalihttpswwwomniglotcomwritingnepalihtm(Accessedon20thMay2018)
[114] NBGP Public comment feedback for Devanagari Gujarati Gurmukhi Script LGRProposalshttpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw(Accessedon18thFeb2019)
10 Booksarticlesandwebographiesconsulted
Followingisathematicallysortedsetofdocumentsbooksarticlesandwebographies
consultedinthedraftingofthisreport
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
11 KachruYamuna1966AnIntroductiontoHindiSyntaxUrbanaUniversityof
IllinoisDepartmentofLinguistics
12 KalyanKaleandAnjaliSoman1986LearningMarathiShriVishakhaPrakashan
Pune
13 McGregorRS(1977)OutlineofHindiGrammar2ndedDelhiOxfordUniversity
Press
14 McGregorRS1972OutlineofHindiGrammarwithExercisesDelhiOxford
UniversityPress
15 McGregorRS1974HindiLiteratureoftheNineteenthandEarlyTwentieth
CenturiesWiesbadenHarrassowitz
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters/sequences

Table 21 below shows characters/character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1   Confusable 2
क U+0915   क़ U+0915 U+093C
ख U+0916   ख़ U+0916 U+093C
ग U+0917   ग़ U+0917 U+093C
च U+091A   च़ U+091A U+093C
छ U+091B   छ़ U+091B U+093C
ज U+091C   ज़ U+091C U+093C
ड U+0921   ड़ U+0921 U+093C
ढ U+0922   ढ़ U+0922 U+093C
फ U+092B   फ़ U+092B U+093C
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.

At first they may not look exactly the same; however, in the given context (e.g. in a browser bar, as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing), they can create visual confusion.

A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
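The whole-label condition above can be sketched in a few lines of Python. This is only an illustration: the table is a small hypothetical excerpt of the Gurmukhi rows of Table 22, the function names are invented here, and the check works on single code points rather than full aksharas.

```python
# Hypothetical excerpt of Table 22: Devanagari code point -> Gurmukhi look-alikes.
DEVA_TO_GURMUKHI = {
    "\u0920": {"\u0A28", "\u0A30"},  # DEVANAGARI TTHA -> GURMUKHI NA, RA
    "\u0921": {"\u0A21", "\u0A24"},  # DEVANAGARI DDA  -> GURMUKHI DDA, TA
    "\u0922": {"\u0A22"},            # DEVANAGARI DDHA -> GURMUKHI DDHA
    "\u0924": {"\u0A1C"},            # DEVANAGARI TA   -> GURMUKHI JA
    "\u092F": {"\u0A27"},            # DEVANAGARI YA   -> GURMUKHI DHA
}


def has_cross_script_confusable(label, table=DEVA_TO_GURMUKHI):
    """True only if every code point of the label has a look-alike in the table."""
    return bool(label) and all(cp in table for cp in label)


def first_distinguishing_code_point(label, table=DEVA_TO_GURMUKHI):
    """Return the first code point without a look-alike, or None if there is none."""
    for cp in label:
        if cp not in table:
            return cp
    return None
```

A single code point without a counterpart (for example क, U+0915, in this excerpt) is enough to make the whole label distinguishable.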
Devanagari confusable   Other script confusable   From script
ः U+0903   ઃ U+0A83   Gujarati
ः U+0903   ః U+0C03   Telugu
ः U+0903   ಃ U+0C83   Kannada
ः U+0903   ഃ U+0D03   Malayalam
ः U+0903   ඃ U+0D83   Sinhala
ः U+0903   ঃ U+0983 (see footnote 17)   Bengali
उ U+0909   ও U+0993   Bengali
घ U+0918   ঘ U+0998   Bengali
ठ U+0920   ਨ U+0A28   Gurmukhi
ठ U+0920   ਰ U+0A30   Gurmukhi
ड U+0921   ਡ U+0A21   Gurmukhi
ड U+0921   ਤ U+0A24   Gurmukhi
ढ U+0922   ਢ U+0A22   Gurmukhi
त U+0924   ਜ U+0A1C   Gurmukhi
य U+092F   ਧ U+0A27   Gurmukhi
ॅ U+0945   ঁ U+0981   Bengali
Table 22: Devanagari cross-script confusables

17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
50   0937   ष   DEVANAGARI LETTER SSA   Consonant   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][104][113]
51   0938   स   DEVANAGARI LETTER SA   Consonant   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][104][105][108][113]
52   0939   ह   DEVANAGARI LETTER HA   Consonant   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][104][105][108][113]
53   093A   ऺ   DEVANAGARI VOWEL SIGN OE   Matra   Kashmiri   4 Kashmiri   [11][105][108]
54   093B   ऻ   DEVANAGARI VOWEL SIGN OOE   Matra   Kashmiri   4 Kashmiri   [11][105][108]
55   093C   ़   DEVANAGARI SIGN NUKTA   Nukta   Bodo, Hindi, Kashmiri, Maithili, Santali, Sindhi   1 Hindi   [0][101][105][108][110][109][111]
56   093E   ा   DEVANAGARI VOWEL SIGN AA   Matra   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][113]
57   093F   ि   DEVANAGARI VOWEL SIGN I   Matra   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][113]
58   0940   ी   DEVANAGARI VOWEL SIGN II   Matra   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][113]
59   0941   ु   DEVANAGARI VOWEL SIGN U   Matra   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][113]
60   0942   ू   DEVANAGARI VOWEL SIGN UU   Matra   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][113]
61   0943   ृ   DEVANAGARI VOWEL SIGN VOCALIC R   Matra   Hindi, Marathi, Sanskrit   1 Hindi   [0][101][102][103]
62   0945   ॅ   DEVANAGARI VOWEL SIGN CANDRA E = candra   Matra   Hindi, Konkani, Marathi, Sanskrit, Kashmiri   1 Hindi   [0][100][101][108]
63   0946   ॆ   DEVANAGARI VOWEL SIGN SHORT E   Matra   Kashmiri   4 Kashmiri   [0][105][108]
64   0947   े   DEVANAGARI VOWEL SIGN E   Matra   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][105][108][113]
65   0948   ै   DEVANAGARI VOWEL SIGN AI   Matra   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][113]
66   0949   ॉ   DEVANAGARI VOWEL SIGN CANDRA O   Matra   Hindi, Konkani, Marathi, Kashmiri   1 Hindi   [0][100][108]
67   094A   ॊ   DEVANAGARI VOWEL SIGN SHORT O   Matra   Kashmiri   4 Kashmiri   [0][105][108]
68   094B   ो   DEVANAGARI VOWEL SIGN O   Matra   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][105][108][113]
69   094C   ौ   DEVANAGARI VOWEL SIGN AU   Matra   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][105][108][113]
70   094D   ्   DEVANAGARI SIGN VIRAMA   Halant/Virama   Most of the languages given in section 3.2   1 Hindi, Nepali   [0][101][102][103][105][108][113]
71   094F   ॏ   DEVANAGARI VOWEL SIGN AW   Matra   Kashmiri   4 Kashmiri   [0][105][108]
72   0956   ॖ   DEVANAGARI VOWEL SIGN UE   Matra   Kashmiri   4 Kashmiri   [11][105][108]
73   0957   ॗ   DEVANAGARI VOWEL SIGN UUE   Matra   Kashmiri   4 Kashmiri   [11][105][108]
74   0972   ॲ   DEVANAGARI LETTER CANDRA A   Vowel   Konkani, Marathi, Kashmiri   2 Konkani, Marathi   [9][100][102][108][112]
75   0973   ॳ   DEVANAGARI LETTER OE   Vowel   Kashmiri   4 Kashmiri   [11][105][108]
76   0974   ॴ   DEVANAGARI LETTER OOE   Vowel   Kashmiri   4 Kashmiri   [11][105][108]
77   0975   ॵ   DEVANAGARI LETTER AW   Vowel   Kashmiri   4 Kashmiri   [11][105][108]
78   0976   ॶ   DEVANAGARI LETTER UE   Vowel   Kashmiri   4 Kashmiri   [11][105][108]
79   0977   ॷ   DEVANAGARI LETTER UUE   Vowel   Kashmiri   4 Kashmiri   [11][105][108]
80   097B   ॻ   DEVANAGARI LETTER GGA   Consonant   Sindhi   2 Sindhi   [8][104]
81   097C   ॼ   DEVANAGARI LETTER JJA   Consonant   Sindhi   2 Sindhi   [8][104]
82   097E   ॾ   DEVANAGARI LETTER DDDA   Consonant   Sindhi   2 Sindhi   [8][104]
83   097F   ॿ   DEVANAGARI LETTER BBA   Consonant   Sindhi   2 Sindhi   [8][104]
Table 6: Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the “Eyelash Reph” construct (see footnote 13).

Sr. No.   Unicode Code Points   Sequence   Character Names   Example languages using the code point (not exhaustive list)   Reference
1   0931 094D 092F   ऱ्य   DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA   Konkani, Marathi, Nepali   [106][107]

13 Unicode uses the term “Eyelash Ra” instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by the normal Ra, U+0930), the term “Reph” is used here.
2   0931 094D 0939   ऱ्ह   DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA   Konkani, Marathi, Nepali   [106][107]
Table 7: Sequences
5.3 Code points not included

The following code points have not been included in the repertoire:

Sr. No.   Unicode Code Point   Glyph   Character Name   Reason for exclusion
1   U+0904   ऄ   DEVANAGARI LETTER SHORT A   Usage unknown. Not required explicitly by any language.
2   U+090C   ऌ   DEVANAGARI LETTER VOCALIC L   Not in modern usage. Excluded as per conservatism principle.
3   U+0929   ऩ   DEVANAGARI LETTER NNNA   Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4   U+0934   ऴ   DEVANAGARI LETTER LLLA   Not required in any spoken language. Required only for transcribing Dravidian l.
5   U+0944   ॄ   DEVANAGARI VOWEL SIGN VOCALIC RR   Not in modern usage. Excluded as per conservatism principle.
6   U+0979   ॹ   DEVANAGARI LETTER ZHA   Not required in any spoken language. Required only in transliteration of Avestan.
7   U+097A   ॺ   DEVANAGARI LETTER HEAVY YA   Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari

All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. In the next section there are detailed akshar formation rules as applicable to representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)

It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.

In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi

This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second section lists four operators, along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].

5.5.1 Variables involved

Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used

Symbol   Function
|   Alternative
[ ]   Optional
*   Variable Repetition
( )   Sequence Group
Table 8: Symbol functions

In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.

5.5.3 The Vowel Sequence

A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), a Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.

The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.

The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:

Sequence Description   Sequence   Example   Constituting characters
Vowel   V   अ (a)   U+0905
Vowel + Anusvara   V[B]   अं (aṁ)   अ + ं   U+0905 U+0902
Vowel + Candrabindu   V[D]   अँ (aṃ)   अ + ँ   U+0905 U+0901
Vowel + Visarga   V[X]   अः (aḥ)   अ + ः   U+0905 U+0903
Table 9
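The rule V[B|D|X] maps directly onto a regular expression. The following Python sketch is illustrative only: the vowel class is an assumed rough range (U+0905 to U+0914), not the exact repertoire of this LGR, and the function name is hypothetical.

```python
import re

# Assumed, simplified character classes (not the exact LGR repertoire).
V = "[\u0905-\u0914]"  # independent vowels, rough range
B = "\u0902"           # Anusvara
D = "\u0901"           # Candrabindu
X = "\u0903"           # Visarga

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQ = re.compile(f"{V}(?:{B}|{D}|{X})?")


def is_vowel_sequence(text):
    """True if the whole string is a single vowel sequence."""
    return VOWEL_SEQ.fullmatch(text) is not None
```

Because the optional group can match at most once, a Visarga after an Anusvara or Candrabindu is rejected, as the rule requires.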
5.5.4 Consonant Sequence

A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the consonant sequence after the N, M and H. Each of these is discussed in the following sections.

1. A single consonant (C)

(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)

Examples:

Sequence Description   Sequence   Example   Constituting characters
Consonant   C   क (ka)   U+0915 <single character>
Consonant + Nukta   C[N]   क़ (ḳa)   क + ़   U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign/Matra [M], or Anusvara [B], or Candrabindu [D], or Visarga [X], or Halant [H]:

C[M|B|D|X|H]

Examples:

Sequence Description   Sequence   Example   Constituting characters
Consonant + Matra   C[M]   कि (ki)   क + ि   U+0915 U+093F
Consonant + Anusvara   C[B]   कं (kaṁ)   क + ं   U+0915 U+0902
Consonant + Candrabindu   C[D]   कँ (kaṃ)   क + ँ   U+0915 U+0901
Consonant + Visarga   C[X]   कः (kaḥ)   क + ः   U+0915 U+0903
Consonant + Halant   C[H]   क् (k, pure consonant)   क + ्   U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X:

(CM)[D|B|X]

Example:

Sequence Description   Sequence   Example   Constituting characters
Consonant + Matra + Anusvara   CM[B]   कीं (kīṁ)   क + ी + ं   U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu   CM[D]   काँ (kāṃ)   क + ा + ँ   U+0915 U+093E U+0901
Consonant + Matra + Visarga   CM[X]   कीः (kīḥ)   क + ी + ः   U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14): 3*(CH)C

Example:

Sequence Description   Sequence   Example   Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant   CHCHCHC   न्क्र्य (nkrya)   U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13

However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.

14 In the case of Sanskrit, it can join up to 5 consonants.

Subsets:

3A. The combination may be followed by M, B, D or X.

Example:
Sequence Description   Sequence   Example   Constituting characters
Consonant + Halant + Consonant + Matra   CHC[M]   क्की (kkī)   U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara   CHC[B]   क्कं (kkaṁ)   U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu   CHC[D]   क्कँ (kkaṃ)   U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga   CHC[X]   क्कः (kkaḥ)   U+0915 U+094D U+0915 U+0903
Table 14
3B. 3*(CH)CM may be followed by a B, D or X.

Example:

Sequence Description   Sequence   Example   Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara   CHCM[B]   क्कीं (kkīṁ)   U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu   CHCM[D]   क्कीँ (kkīṃ)   U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga   CHCM[X]   क्कीः (kkīḥ)   U+0915 U+094D U+0915 U+0940 U+0903
Table 15
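Taken together, the consonant rules of Section 5.5.4 (1, 2, 2A, 3, 3A and 3B) can be approximated by one regular expression. The sketch below is an illustration under stated assumptions: the consonant and matra classes are rough Unicode ranges rather than the LGR repertoire, the limit of three joining Halants follows the Hindi rule 3 (the WLE rules in Section 7 do not impose it), and all names are hypothetical.

```python
import re

# Assumed, simplified character classes (not the exact LGR repertoire).
C = "[\u0915-\u0939]"  # consonants, rough range
M = "[\u093E-\u094C]"  # matras, rough range
N = "\u093C"           # Nukta
B, D, X = "\u0902", "\u0901", "\u0903"  # Anusvara, Candrabindu, Visarga
H = "\u094D"           # Halant

CN = f"{C}{N}?"        # a consonant is coterminous with consonant + Nukta

# Up to three (C H) groups, then a consonant, then either a final Halant
# or an optional Matra followed by an optional B, D or X.
CONSONANT_SEQ = re.compile(
    f"(?:{CN}{H}){{0,3}}{CN}(?:{H}|{M}?[{B}{D}{X}]?)"
)


def is_consonant_sequence(text):
    """True if the whole string is a single consonant-initial akshar."""
    return CONSONANT_SEQ.fullmatch(text) is not None
```

For instance, the Table 13 example (U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F) is accepted, while a Matra placed directly after a Halant is not.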
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters
and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
6 Variants

There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:

Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A lists them.

Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta

The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.

This gives rise to a possibility of creation of certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed:
Variant 1   Variant 2
आ U+0906   आ़ U+0906 U+093C
ओ U+0913   ओ़ U+0913 U+093C
ा U+093E   ा़ U+093E U+093C
ो U+094B   ो़ U+094B U+093C
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants

All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2, e.g. in the first pair आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination आ़़ (U+0906 U+093C U+093C), where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.

Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:

आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)

The variant relationship from Variant 2 to Variant 1 should be equally constrained for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric
definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC8228]). Second, this type of variant pair is an “effective null variant”, where the U+093C in one sequence maps to “nothing” in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
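The effect of the “not followed by Nukta” condition can be illustrated with a short Python sketch. It covers only the forward (Variant 1 to Variant 2) direction of Table 16, applies a single mapping at a time, and uses a hypothetical function name; it is not how an LGR tool is necessarily implemented.

```python
NUKTA = "\u093C"
# Variant 1 column of Table 16.
NUKTA_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def add_nukta_variants(label):
    """Labels obtained by applying one Variant 1 -> Variant 2 mapping.

    The mapping applies only where the base character is NOT already
    followed by a Nukta, so a double Nukta can never be generated.
    """
    variants = []
    for i, cp in enumerate(label):
        followed_by_nukta = label[i + 1 : i + 2] == NUKTA
        if cp in NUKTA_BASES and not followed_by_nukta:
            variants.append(label[: i + 1] + NUKTA + label[i + 1 :])
    return variants
```

Applying the function again to one of its own results yields nothing new at the same position, which is the regenerative loop the context rule is meant to stop.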
6.1.2 Overlapped variant analysis involving Nukta

Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.

Set   Mapping
Variant Set A   093E 0901 <--> 0949 0902
Variant Set B   0906 0901 <--> 0911 0902
Variant Set C   0906 0902 <--> 0974
Variant Set D   093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:

Source   Glyph   Target   Glyph   Type   Variant Context
0906   आ   0906 093C   आ़   ↔ blocked   not followed-by-N
093E   ा   093E 093C   ा़   ↔ blocked   not followed-by-N

When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:

093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.

For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902, 0903. (Both can be at
the end of the label.) Therefore the variant mapping should receive a context rule when (followed-by-V-C-or-end). This matches the intersection between these contexts.

Likewise for Set D: 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when (followed-by-V-C-or-end).

The resulting variant sets and variant contextual rules are:

Set   Mapping   Variant Contextual Rule
Variant Set A   093E 0901 <--> 093E 093C 0901 <--> 0949 0902   -
Variant Set B   0906 0901 <--> 0906 093C 0901 <--> 0911 0902   -
Variant Set C   0906 0902 <--> 0906 093C 0902 <--> 0974   when (followed-by-V-C-or-end)
Variant Set D   093B <--> 093E 0902 <--> 093E 093C 0902   when (followed-by-V-C-or-end)
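Symmetry and transitivity mean that every member of a set maps to every other member. As a rough illustration, the sets above can be written as data and expanded into all directed mappings. The data layout, the function name and the choice to attach the set's context rule to every mapping are assumptions made for this sketch, not a description of the accompanying XML file.

```python
from itertools import permutations

# Variant sets as tuples of code points, after adding the Nukta sequences.
VARIANT_SETS = {
    "A": [(0x093E, 0x0901), (0x093E, 0x093C, 0x0901), (0x0949, 0x0902)],
    "B": [(0x0906, 0x0901), (0x0906, 0x093C, 0x0901), (0x0911, 0x0902)],
    "C": [(0x0906, 0x0902), (0x0906, 0x093C, 0x0902), (0x0974,)],
    "D": [(0x093B,), (0x093E, 0x0902), (0x093E, 0x093C, 0x0902)],
}

# Variant contextual rule per set (None where the table shows "-").
CONTEXT_RULE = {
    "A": None,
    "B": None,
    "C": "followed-by-V-C-or-end",
    "D": "followed-by-V-C-or-end",
}


def mappings(set_name):
    """All directed (source, target, rule) mappings within one variant set."""
    rule = CONTEXT_RULE[set_name]
    return [(src, dst, rule) for src, dst in permutations(VARIANT_SETS[set_name], 2)]
```

A set with three members therefore yields six directed mappings, three pairs in each direction.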
Additional Notes:

1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule “when (follows-only-C-or-N)” (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.

2. Another variant set overlapped with these four sets is 0902 (B: Anusvara) <--> 093A (M: Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri

Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.

Variant 1   Variant 2
ॳ U+0973   अं U+0905 U+0902
ऺ U+093A   ं U+0902
ॴ U+0974   आं U+0906 U+0902
ऻ U+093B   ां U+093E U+0902
ऎ U+090E   ऐ U+0910
ॆ U+0946   े U+0947
ॵ U+0975   औ U+0914
ॏ U+094F   ौ U+094C
Table 17: Proposed Variants - Set 2

No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point contexts, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)

Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.

In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara

This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:

Variant 1   Variant 2
ँ U+0901   ॅं U+0945 U+0902
ाँ U+093E U+0901   ॉं U+0949 U+0902
अँ U+0905 U+0901   ॲं U+0972 U+0902
एँ U+090F U+0901   ऍं U+090D U+0902
आँ U+0906 U+0901   ऑं U+0911 U+0902
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character like a Consonant or a Vowel before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क, U+0915):

Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)

Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should each be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair

The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.

Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.

There is no additional constraint on the preceding character of ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire, and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
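The context condition of this rule can be sketched as a lookbehind over the label. In the Python illustration below, the consonant test is an assumed rough range (U+0915 to U+0939) rather than the LGR repertoire, and the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"


def is_consonant(cp):
    """Assumed rough consonant range; not the exact LGR repertoire."""
    return len(cp) == 1 and "\u0915" <= cp <= "\u0939"


def candrabindu_variant_positions(label):
    """Indexes where U+0901 may be mapped to U+0945 U+0902.

    The mapping applies only if the Candrabindu is preceded by a Consonant,
    or by a Consonant followed by a Nukta.
    """
    positions = []
    for i, cp in enumerate(label):
        if cp != CANDRABINDU:
            continue
        prev1 = label[i - 1] if i >= 1 else ""
        prev2 = label[i - 2] if i >= 2 else ""
        if is_consonant(prev1) or (prev1 == NUKTA and is_consonant(prev2)):
            positions.append(i)
    return positions
```

A Candrabindu after a vowel or a Matra is left alone by this mapping; those cases are covered by the other pairs of Table 18.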
6.5 Variant Disposition

As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
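The first-come behaviour of blocked variants can be illustrated with a toy registry. This Python sketch is an assumption-laden simplification: it uses only the four single-code-point pairs of Table 17, reduces each label to a hypothetical canonical key, and ignores context rules and sequence variants. The class and function names are invented for the illustration.

```python
# Single-code-point pairs taken from Table 17 (sequence variants omitted).
PAIRS = [
    ("\u090E", "\u0910"),
    ("\u0946", "\u0947"),
    ("\u0975", "\u0914"),
    ("\u094F", "\u094C"),
]

# Map both members of each pair to one representative (the lower code point).
CANON = {}
for a, b in PAIRS:
    rep = min(a, b)
    CANON[a] = rep
    CANON[b] = rep


def variant_key(label):
    """Key shared by all labels that are variants of each other."""
    return "".join(CANON.get(cp, cp) for cp in label)


class Registry:
    """Toy first-come registry: later variant labels are blocked."""

    def __init__(self):
        self._taken = {}

    def apply(self, label):
        key = variant_key(label)
        if key in self._taken:
            return "blocked"
        self._taken[key] = label
        return "allocated"
```

Whichever of two variant labels is applied for first is allocated, and the other is then reported as blocked, with no preference between them.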
6.6 Cross-script Variants

A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.

Every individual LGR under the NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under the NBGP.

The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word ‘most’ is used here as it is not practical to cover all the possible “Consonant + Halant + Consonant + …” cases. However, for Devanagari all cases of “Consonant + Halant + Consonant” combinations were included in the analysis.)

The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.

It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.

The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
U+0902 | U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
U+093A | U+0A02
U+093C | U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
U+0945 | U+0A71
U+0946 | U+0A47
U+0946 | U+0A4B
U+0947 | U+0A47
U+0947 | U+0A4B
U+0948 | U+0A48
U+0956 | U+0A41
U+0957 | U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1 N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
The set V1 consists of these vowels:
a आ (U+0906) (Required in the Santali language)
b ओ (U+0913) (Required in the Santali language)
The set M1 consists of these matras:
a ा (U+093E) (Required in the Santali language)
b ो (U+094B) (Required in the Santali language)
2 H must be preceded by C or CN (footnote 15)
3 M must be preceded by C or CN (footnote 16)
4 X must be preceded by either of V, C, N or M
5 B must be preceded by either of V, C, N or M
6 D must be preceded by either of V, C, N or M
7 V can NOT be preceded by H (details in "Case of V preceded by H")
15, 16 where CN is a C followed by an N
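The seven rules can be read as checks on the category of each character and of the one or two characters before it. The following Python sketch illustrates that reading. The category ranges are simplifying assumptions, not the exact Table 6 repertoire, and the Eyelash Reph (S) is not modelled. All function and constant names are hypothetical.

```python
# Sets named in rule 1.
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = set("\u0906\u0913")
M1 = set("\u093E\u094B")
SIGNS = {0x0901: "D", 0x0902: "B", 0x0903: "X", 0x093C: "N", 0x094D: "H"}
EXTRA_C = {0x097B, 0x097C, 0x097E, 0x097F}          # assumption: Sindhi letters
EXTRA_M = {0x093A, 0x093B, 0x094F, 0x0956, 0x0957}  # assumption: extra matras


def category(ch):
    """Map a character to its WLE symbol, or None if unclassified."""
    cp = ord(ch)
    if cp in SIGNS:
        return SIGNS[cp]
    if 0x0905 <= cp <= 0x0914 or 0x0972 <= cp <= 0x0977:
        return "V"
    if 0x0915 <= cp <= 0x0939 or cp in EXTRA_C:
        return "C"
    if 0x093E <= cp <= 0x094C or cp in EXTRA_M:
        return "M"
    return None


def wle_errors(label):
    """Return (index, rule) pairs for every violation found in label."""
    cats = [category(ch) for ch in label]
    errors = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        prev2 = cats[i - 2] if i > 1 else None
        after_c_or_cn = prev == "C" or (prev == "N" and prev2 == "C")
        if cat is None:
            errors.append((i, "unclassified"))
        elif cat == "N" and (i == 0 or label[i - 1] not in C1 | V1 | M1):
            errors.append((i, "rule 1"))
        elif cat == "H" and not after_c_or_cn:
            errors.append((i, "rule 2"))
        elif cat == "M" and not after_c_or_cn:
            errors.append((i, "rule 3"))
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            errors.append((i, "rules 4-6"))
        elif cat == "V" and prev == "H":
            errors.append((i, "rule 7"))
    return errors
```

Under these assumptions, a label in which a vowel directly follows a Halant is reported under rule 7, and a Nukta after a consonant outside C1 is reported under rule 1.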
Additional rules are used only for variants in which a Nukta maps to null, or which overlap:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1 As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2 The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or with a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DN Domains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and .np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages / scripts:
Name | Language / Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0/ (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1 Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition, in 2 Volumes. Hutchison, London, 1968.
10.2 DEVANĀGARĪ
1 Agrawala, V.S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2 Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3 Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4 Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden / New York: E.J. Brill.
5 Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6 Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7 Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8 Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9 Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10 Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11 Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12 Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13 McGregor, R.S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14 McGregor, R.S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15 McGregor, R.S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16 McGregor, R.S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17 Pandey, P.K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10 (2), 139-156.
18 Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi / Hindavi. Delhi: Oxford University Press.
19 Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20 Singh, A.K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P.G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: DK Printworld.
21 Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22 Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23 Verma, M.K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1 IS 10401: 8-bit code for information interchange. 1982.
2 IS 10315: 7-bit coded character set for information interchange. 1985.
3 IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques. 1987.
4 ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters. 2001.
5 ISO 2375: Procedure for registration of escape sequences. 2003.
6 ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13. 1998-2001.
7 IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters / sequences
Table 21 below shows characters / character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are grouped only on the basis of possible visual confusability.
At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as a part of a domain name, or as a single word with no surrounding text from the same script to distinguish them, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters / aksharas have an equivalent confusable in the other script. If there is even one single character / akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
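The all-or-nothing condition in the last paragraph can be sketched as follows. This is a simplified illustration: it works per code point rather than per akshara, the mapping holds only the Gurmukhi rows of Table 22, and the names `GURMUKHI_CONFUSABLES` and `fully_confusable` are hypothetical.

```python
# Devanagari code point -> Gurmukhi confusables (Gurmukhi rows of Table 22).
GURMUKHI_CONFUSABLES = {
    "\u0920": {"\u0A28", "\u0A30"},
    "\u0921": {"\u0A21", "\u0A24"},
    "\u0922": {"\u0A22"},
    "\u0924": {"\u0A1C"},
    "\u092F": {"\u0A27"},
}


def fully_confusable(label, table=GURMUKHI_CONFUSABLES):
    """True only if every character of label has a confusable in the table."""
    return len(label) > 0 and all(ch in table for ch in label)
```

A single character without a counterpart is enough to make the whole label distinguishable under this sketch.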
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 (footnote 17) | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
U+0945 | U+0981 | Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
62 | 0945 | ॅ | DEVANAGARI VOWEL SIGN CANDRA E = candra | Matra | Hindi, Konkani, Marathi, Sanskrit, Kashmiri | 1 Hindi | [0][100][101][108]
63 | 0946 | ॆ | DEVANAGARI VOWEL SIGN SHORT E | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
64 | 0947 | े | DEVANAGARI VOWEL SIGN E | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
65 | 0948 | ै | DEVANAGARI VOWEL SIGN AI | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][113]
66 | 0949 | ॉ | DEVANAGARI VOWEL SIGN CANDRA O | Matra | Hindi, Konkani, Marathi, Kashmiri | 1 Hindi | [0][100][108]
67 | 094A | ॊ | DEVANAGARI VOWEL SIGN SHORT O | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
68 | 094B | ो | DEVANAGARI VOWEL SIGN O | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
69 | 094C | ौ | DEVANAGARI VOWEL SIGN AU | Matra | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
70 | 094D | ् | DEVANAGARI SIGN VIRAMA | Halant / Virama | Most of the languages given in section 3.2 | 1 Hindi, Nepali | [0][101][102][103][105][108][113]
71 | 094F | ॏ | DEVANAGARI VOWEL SIGN AW | Matra | Kashmiri | 4 Kashmiri | [0][105][108]
72 | 0956 | ॖ | DEVANAGARI VOWEL SIGN UE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
73 | 0957 | ॗ | DEVANAGARI VOWEL SIGN UUE | Matra | Kashmiri | 4 Kashmiri | [11][105][108]
74 | 0972 | ॲ | DEVANAGARI LETTER CANDRA A | Vowel | Konkani, Marathi, Kashmiri | 2 Konkani, Marathi | [9][100][102][108][112]
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8][104]
Table 6 Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph" (footnote 13) construct.
Sr No | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not an exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]
Table 7 Sequences
13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by Normal Ra, U+0930), the term "Reph" is used here.
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr No | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | ॄ | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of formation of their words, known as akshar. In the next section there are detailed akshar formation rules as applicable to the representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition / deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first subsection lists the categories of the characters in the form of variables; in the rules, the variable names are used instead of the descriptive names. The second subsection lists four operators along with their functions, which are assumed while specifying the rules. The following two subsections describe the two major categories of akshar formations, the first of which begins with a vowel and the second with a consonant. These rules are based on an Indian Standard (IS 13194:1991), popularly known as Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
5.5.2 Operators used
Symbol  Function
|       Alternative
[]      Optional
*       Variable Repetition
()      Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ a | U+0905
Vowel + Anusvara | V[B] | अं aṁ | अ ं U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ aṃ | अ ँ U+0905 U+0901
Vowel + Visarga | V[X] | अः aḥ | अ ः U+0905 U+0903
Table 9
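The pattern V[B|D|X] maps directly onto a regular expression. The sketch below assumes, for illustration only, that the vowel class is the range U+0905..U+0914; the actual vowel repertoire is the one listed in Table 6. The names `VOWEL_SEQ` and `is_vowel_sequence` are hypothetical.

```python
import re

V = "[\u0905-\u0914]"            # assumption: simplified vowel class
BDX = "[\u0902\u0901\u0903]"     # Anusvara, Candrabindu, Visarga
VOWEL_SEQ = re.compile(f"{V}{BDX}?")


def is_vowel_sequence(text):
    """True if text is a vowel optionally followed by one of B, D or X."""
    return VOWEL_SEQ.fullmatch(text) is not None
```

Because the trailing sign is optional and may occur at most once, a vowel followed by both an Anusvara and a Visarga is rejected.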
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the Consonant sequence after the N, M and H. Each of these has been discussed in the following sections.
1 A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क ka | U+0915 <single character>
Consonant + Nukta | C[N] | क़ ḳa | क ़ U+0915 U+093C
Table 10
2 A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि ki | क ि U+0915 U+093F
Consonant + Anusvara | C[B] | कं kaṁ | क ं U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ kaṃ | क ँ U+0915 U+0901
Consonant + Visarga | C[X] | कः kaḥ | क ः U+0915 U+0903
Consonant + Halant | C[H] | क् k (Pure Consonant) | क ् U+0915 U+094D
Table 11
2A A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं kīṁ | क ी ं U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ kāṃ | क ा ँ U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः kīḥ | क ी ः U+0915 U+0940 U+0903
Table 12
3 A sequence of consonants (up to 4) joined by Halant (footnote 14):
*3(CH)C
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य nkrya | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets
3A The combination may be followed by M, B, D or X
Example
14 In the case of Sanskrit, it can join up to 5 consonants.
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की kkī | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं kkaṁ | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ kkaṃ | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः kkaḥ | U+0915 U+094D U+0915 U+0903
Table 14
3B *3(CH)CM may be followed by a B, D or X
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं kkīṁ | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ kkīṃ | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः kkīḥ | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
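The consonant-sequence cases 1 to 3B can be combined into one regular expression. The sketch below is an illustration under stated assumptions: the consonant and matra classes are simplified ranges (U+0915..U+0939 and U+093E..U+094C), a Nukta is allowed after any consonant, and the cluster is limited to four consonants as in case 3. The names are hypothetical.

```python
import re

C = "[\u0915-\u0939]"            # assumption: simplified consonant class
M = "[\u093E-\u094C]"            # assumption: simplified matra class
N, H = "\u093C", "\u094D"        # Nukta, Halant
BDX = "[\u0902\u0901\u0903]"     # Anusvara, Candrabindu, Visarga
CONS = f"{C}{N}?"                # a consonant with its optional Nukta

# Up to three (consonant + Halant) units, a final consonant, then either a
# Halant or an optional Matra followed by an optional B, D or X.
CONSONANT_SEQ = re.compile(f"(?:{CONS}{H}){{0,3}}{CONS}(?:{H}|{M}?{BDX}?)")


def is_consonant_sequence(text):
    """True if text matches the consonant-sequence pattern sketched above."""
    return CONSONANT_SEQ.fullmatch(text) is not None
```

The four-consonant limit reflects the Hindi akshar rules only; as noted above, the WLE rules in Section 7 do not restrict the number of consonants joined by a Halant.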
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
6 Variants
There are no characters / character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from character formations as normally perceived by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed as variants, as there is another panel (String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters / sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. They can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel / Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to a possibility of creation of certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed.
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2, e.g. in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character.
Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both the forward and the reverse (symmetric) mapping using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
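The effect of the "not followed by a Nukta" condition can be illustrated with a minimal Python sketch. The function names below are hypothetical and are not part of the LGR or of any LGR tool; the sketch only shows that, once a substitution has been made, the context rule stops the mapping from applying again at the same position.

```python
NUKTA = "\u093C"

# Variant 1 -> Variant 2 pairs from Table 16.
SANTALI_NUKTA_VARIANTS = {
    "\u0906": "\u0906\u093C",  # letter AA
    "\u0913": "\u0913\u093C",  # letter O
    "\u093E": "\u093E\u093C",  # vowel sign AA
    "\u094B": "\u094B\u093C",  # vowel sign O
}


def forward_variant_positions(label):
    """Indexes where the Variant 1 -> Variant 2 mapping is defined."""
    positions = []
    for index, char in enumerate(label):
        if char not in SANTALI_NUKTA_VARIANTS:
            continue
        # Context rule: the mapping exists only if no Nukta follows.
        if label[index + 1:index + 2] != NUKTA:
            positions.append(index)
    return positions


def apply_forward_variant(label, index):
    """Replace the Variant 1 code point at index with its Variant 2 sequence."""
    return label[:index] + SANTALI_NUKTA_VARIANTS[label[index]] + label[index + 1:]
```

For a label starting with U+0906, the mapping applies once; in the resulting label the same U+0906 is followed by a Nukta, so no further substitution (and no double Nukta) is generated.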
6.1.2 Overlapped variant analysis involving Nukta

Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.

Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:

Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N

When the variant from the overlapping sets is substituted for 0906 or 093E respectively in the sequences of Variant Sets A, B, C and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:

093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when (followed-by-V-C-or-end). This matches the intersection between these contexts.

Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when (followed-by-V-C-or-end).
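The intersection argument for Sets C and D can be written out as a small Python sketch. The follower sets below are a reading of the two paragraphs above, expressed with the category symbols V and C plus the three code points named there; the dictionary and function names are illustrative assumptions only.

```python
# What may follow each code point, as described in the text above.
# "end" stands for the end of the label.
FOLLOWERS = {
    "0902": {"V", "C", "end"},
    "0974": {"V", "C", "end", "0901", "0902", "0903"},
    "093B": {"V", "C", "end", "0901", "0902", "0903"},
}


def shared_following_context(code_point_a, code_point_b):
    """Contexts in which both ends of a variant mapping are valid."""
    return FOLLOWERS[code_point_a] & FOLLOWERS[code_point_b]
```

Both `shared_following_context("0902", "0974")` and `shared_following_context("0902", "093B")` reduce to V, C or end of label, which is the condition named followed-by-V-C-or-end.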
The resulting variant sets and variant contextual rules are:

Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when (followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when (followed-by-V-C-or-end)
Additional Notes

1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when (follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences that include 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri

Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.

Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
U+093A | U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
U+0946 | U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C

Table 17: Proposed Variants - Set 2

No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)

Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, many users tend to ignore the phonetic effect of its presence/absence when it comes at the end. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.

In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara

This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. (U+0945) and ॉ (U+0949), which when succeeded by an Anusvara (U+0902) create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:

Variant 1 | Variant 2
U+0901 | U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902

Table 18: Proposed Variants - Set 3

As the first two pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):

Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)

Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity, and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair

The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for (U+0945), which is a Matra. A Matra can only be preceded by a Consonant or a Consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.

Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.

There is no additional constraint on the preceding character of (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
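The context condition of this rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the normative rule from the XML file: the helper names are hypothetical, and the consonant test uses the block range U+0915..U+0939 as an approximation of category C (the authoritative membership is the code point repertoire in Table 6).

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"


def is_consonant(char):
    # Assumption: U+0915..U+0939 approximates category C. The real
    # repertoire (Table 6) excludes some of these and adds others.
    return "\u0915" <= char <= "\u0939"


def candrabindu_variant_defined(label, index):
    """True if the U+0901 at index may map to U+0945 U+0902.

    The mapping exists only when the Candrabindu is preceded by a
    Consonant, or by a Consonant followed by a Nukta.
    """
    if label[index] != CANDRABINDU:
        return False
    if index >= 1 and is_consonant(label[index - 1]):
        return True
    return (
        index >= 2
        and label[index - 1] == NUKTA
        and is_consonant(label[index - 2])
    )
```

A Candrabindu after क (U+0915) or after क़ (U+0915 U+093C) satisfies the condition; a Candrabindu after a Vowel such as अ (U+0905) or after a Matra such as U+093E does not, so no variant is generated there.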
6.5 Variant Disposition

As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.

There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
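The "whichever is chosen earlier" behaviour can be pictured with a toy Python sketch. It is only an illustration: the class and its method are hypothetical, and it covers just the four single-code-point pairs of Table 17, which need no context rule. A real implementation would compute the full variant set from the LGR, including sequences and context rules.

```python
# Single code point pairs from Table 17 (Variant 1 -> Variant 2).
# Folding every label onto the Variant 2 side gives one key per variant set.
FOLD = str.maketrans({
    "\u090E": "\u0910",
    "\u0946": "\u0947",
    "\u0975": "\u0914",
    "\u094F": "\u094C",
})


class VariantRegistry:
    """First-come allocation with blocked variants (illustrative only)."""

    def __init__(self):
        self._allocated = {}  # folded key -> label that was allocated

    def apply(self, label):
        key = label.translate(FOLD)
        if key not in self._allocated:
            self._allocated[key] = label
            return "allocated"
        # Same label requested again, or a variant of an allocated label.
        return "taken" if self._allocated[key] == label else "blocked"
```

Applying for a label containing U+0947 first and for the same label with U+0946 afterwards yields "allocated" and then "blocked"; applying in the opposite order gives the mirror result, reflecting that there is no preference among the variants.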
6.6 Cross-script Variants

A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.

Every individual LGR under NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under NBGP.

NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word "most" is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)

The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. Table 19 lists the cases proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.

It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.

NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
U+0902 | U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
U+093A | U+0A02
U+093C | U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
U+0945 | U+0A71
U+0946 | U+0A47
U+0946 | U+0A4B
U+0947 | U+0A47
U+0947 | U+0A4B
U+0948 | U+0A48
U+0956 | U+0A41
U+0957 | U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C

Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF

Table 20: Proposed Cross-script Devanagari-Bengali Variants

In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)

This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.

Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):

C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)

Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.

   The set C1 consists of these consonants:
   a. क (U+0915)
   b. ख (U+0916)
   c. ग (U+0917)
   d. च (U+091A)
   e. छ (U+091B)
   f. ज (U+091C)
   g. ड (U+0921)
   h. ढ (U+0922)
   i. फ (U+092B)

   The set V1 consists of these vowels:
   a. आ (U+0906) (Required in the Santali language)
   b. ओ (U+0913) (Required in the Santali language)

   The set M1 consists of these matras:
   a. ा (U+093E) (Required in the Santali language)
   b. ो (U+094B) (Required in the Santali language)

2. H must be preceded by C or CN (where CN is a C followed by an N).
3. M must be preceded by C or CN (where CN is a C followed by an N).
4. X must be preceded by either of V, C, N or M.
5. B must be preceded by either of V, C, N or M.
6. D must be preceded by either of V, C, N or M.
7. V can NOT be preceded by H (details in "Case of V preceded by H").
Additional rules are used only for variants where a Nukta maps to a null, or that are overlapped:

- Variant is not defined if followed by a Nukta (see 6.1.1).
- Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2).
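The seven WLE rules can be read as checks on each code point and its predecessor. The following Python sketch shows one way to express them; it is an illustration, not the normative rule set from the accompanying XML file. The function names are hypothetical, and the category sets are approximations built from Unicode ranges and from the excluded code points listed in this document, so Table 6 remains the authority on category membership. The sketch also ignores the Eyelash Reph sequences and the variant-only rules.

```python
NUKTA, HALANT = "\u093C", "\u094D"
CANDRABINDU, ANUSVARA, VISARGA = "\u0901", "\u0902", "\u0903"

# Assumption: approximate category membership; see Table 6 for the real sets.
CONSONANTS = ({chr(cp) for cp in range(0x0915, 0x093A)}
              - {"\u0929", "\u0934"}) | set("\u097B\u097C\u097E\u097F")
VOWELS = ({chr(cp) for cp in range(0x0905, 0x0915)} - {"\u090C"}) \
    | {chr(cp) for cp in range(0x0972, 0x0978)}
MATRAS = ({chr(cp) for cp in range(0x093E, 0x094D)} - {"\u0944"}) \
    | set("\u093A\u093B\u094F\u0956\u0957")

# Rule 1: the only code points a Nukta may follow (sets C1, V1, M1).
NUKTA_BASES = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B"
                  "\u0906\u0913\u093E\u094B")


def category(char):
    """Map a code point to its WLE symbol, or None if unknown."""
    if char in CONSONANTS:
        return "C"
    if char in VOWELS:
        return "V"
    if char in MATRAS:
        return "M"
    return {NUKTA: "N", HALANT: "H", ANUSVARA: "B",
            CANDRABINDU: "D", VISARGA: "X"}.get(char)


def wle_violations(label):
    """Return (index, rule number) pairs for every rule that is broken."""
    found = []
    for i, char in enumerate(label):
        kind = category(char)
        prev = label[i - 1] if i > 0 else ""
        prev_kind = category(prev) if prev else None
        # "C or CN": a consonant, or a Nukta that itself follows a consonant.
        after_c_or_cn = prev_kind == "C" or (
            prev_kind == "N" and i > 1 and category(label[i - 2]) == "C")
        if kind == "N" and prev not in NUKTA_BASES:
            found.append((i, 1))
        if kind == "H" and not after_c_or_cn:
            found.append((i, 2))
        if kind == "M" and not after_c_or_cn:
            found.append((i, 3))
        if kind in {"X", "B", "D"} and prev_kind not in {"V", "C", "N", "M"}:
            found.append((i, {"X": 4, "B": 5, "D": 6}[kind]))
        if kind == "V" and prev_kind == "H":
            found.append((i, 7))
    return found
```

For instance, the sequence U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930, discussed under "Case of V preceded by H", is reported as breaking rule 7 at the position of U+0905, while the same label without the Halant passes all seven checks.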
Case of Eyelash Reph

In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:

1. As the U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it gets permitted only within those specific sequences.
2. The last characters of both the sequences of which the U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.

Case of V preceded by H

As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.

There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार (aːməchaːr, "Mango pickle") (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).

This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.

If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors

NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data

Following is the full list of NBGP members with their language expertise:

Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | Basanta Kumar Panda | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | Debajit Sharma | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DN Domains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi

In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:

Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References

[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC 7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC 8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11, http://www.unicode.org/versions/Unicode11.0.0/ (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted

Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.

10.1 WRITING SYSTEMS

1. Diringer, D. The Alphabet: A Key to the History of Mankind. 3rd Edition, in 2 Volumes. Hutchinson, London, 1968.

10.2 DEVANĀGARĪ

1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy 10 (2): 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.

10.3 INDIC COMPUTING SPECIFIC

1. IS 10401: 8-bit code for information interchange. 1982.
2. IS 10315: 7-bit coded character set for information interchange. 1985.
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques. 1987.
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters. 2001.
5. ISO 2375: Procedure for registration of escape sequences. 2003.
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13. 1998-2001.
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences

Table 21 below shows characters/character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C

Table 21: Visually confusables
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.

At first they may not look exactly the same; however, in the given context, e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.

A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
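The "all constituents must have a confusable" condition can be sketched in Python as below. This is a simplification under stated assumptions: the function name is hypothetical, the check is done per code point rather than per akshara, and the table holds only the Devanagari-Gurmukhi rows of Table 22.

```python
# Devanagari code point -> Gurmukhi confusables (Gurmukhi rows of Table 22).
GURMUKHI_CONFUSABLES = {
    "\u0920": ["\u0A28", "\u0A30"],
    "\u0921": ["\u0A21", "\u0A24"],
    "\u0922": ["\u0A22"],
    "\u0924": ["\u0A1C"],
    "\u092F": ["\u0A27"],
}


def is_fully_confusable(label, table):
    """True only if every code point of the label has a confusable in table."""
    return len(label) > 0 and all(char in table for char in label)
```

A label made of त (U+0924) and य (U+092F) passes the check, whereas adding a single क (U+0915), which has no entry in the table, is enough to make the whole label distinguishable.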
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 (see footnote 17) | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
U+0945 | U+0981 | Bengali

Table 22: Devanagari Cross-script confusables

17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
75 | 0973 | ॳ | DEVANAGARI LETTER OE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
76 | 0974 | ॴ | DEVANAGARI LETTER OOE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
77 | 0975 | ॵ | DEVANAGARI LETTER AW | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
78 | 0976 | ॶ | DEVANAGARI LETTER UE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
79 | 0977 | ॷ | DEVANAGARI LETTER UUE | Vowel | Kashmiri | 4 Kashmiri | [11][105][108]
80 | 097B | ॻ | DEVANAGARI LETTER GGA | Consonant | Sindhi | 2 Sindhi | [8][104]
81 | 097C | ॼ | DEVANAGARI LETTER JJA | Consonant | Sindhi | 2 Sindhi | [8][104]
82 | 097E | ॾ | DEVANAGARI LETTER DDDA | Consonant | Sindhi | 2 Sindhi | [8][104]
83 | 097F | ॿ | DEVANAGARI LETTER BBA | Consonant | Sindhi | 2 Sindhi | [8][104]

Table 6: Code point repertoire
Apart from the above individual code points, the Neo-Brahmi Generation Panel also proposes some specific sequences which enable conditional inclusion of the DEVANAGARI LETTER RRA in the repertoire, for enabling inclusion of the "Eyelash Reph" construct (see footnote 13).

Sr. No. | Unicode Code Points | Sequence | Character Names | Example languages using the code point (not exhaustive list) | Reference
1 | 0931 094D 092F | ऱ्य | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER YA | Konkani, Marathi, Nepali | [106][107]
2 | 0931 094D 0939 | ऱ्ह | DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA | Konkani, Marathi, Nepali | [106][107]

Table 7: Sequences

13 Unicode uses the term "Eyelash Ra" instead. Since the construct that is formed by this sequence is a special form of Reph (which is otherwise formed by the normal Ra, U+0930), the term "Reph" is used here.
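Because U+0931 enters the repertoire only through the two sequences of Table 7, a label can be screened for stray occurrences with a single pattern. The Python sketch below is illustrative only; the function name is hypothetical and the check says nothing about the other rules a label must satisfy.

```python
import re

# U+0931 that is NOT followed by U+094D and then U+092F or U+0939.
STRAY_RRA = re.compile("\u0931(?!\u094D[\u092F\u0939])")


def rra_usage_is_valid(label):
    """True if every U+0931 in the label is part of a Table 7 sequence."""
    return STRAY_RRA.search(label) is None
```

The negative lookahead keeps the pattern from consuming the following code points, so several Eyelash Reph sequences in one label are each checked independently.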
5.3 Code points not included

The following code points have not been included in the repertoire:

Sr. No. | Unicode Code Point | Glyph | Character Name | Reason for exclusion
1 | U+0904 | ऄ | DEVANAGARI LETTER SHORT A | Usage unknown. Not required explicitly by any language.
2 | U+090C | ऌ | DEVANAGARI LETTER VOCALIC L | Not in modern usage. Excluded as per the conservatism principle.
3 | U+0929 | ऩ | DEVANAGARI LETTER NNNA | Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4 | U+0934 | ऴ | DEVANAGARI LETTER LLLA | Not required in any spoken language. Required only for transcribing Dravidian l.
5 | U+0944 | | DEVANAGARI VOWEL SIGN VOCALIC RR | Not in modern usage. Excluded as per the conservatism principle.
6 | U+0979 | ॹ | DEVANAGARI LETTER ZHA | Not required in any spoken language. Required only in transliteration of Avestan.
7 | U+097A | ॺ | DEVANAGARI LETTER HEAVY YA | Usage unknown. Not required explicitly by any language.
54 StructuralFormationofDevanagari
AllthelanguageswritteninBrahmiderivedscriptsfollowaparticularwayofformationof
theirwordsknownasaksharInthenextsectiontherearedetailedaksharformationrulesas applicable to representation of the Hindi language when written in the Devanagari
ScriptThese rulesneed slight additions fordifferent languageswritten inDevanagari intermsof
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
27
-Characteradditiondeletion(egNukta[U+093C]characterisapplicableforHindi
butnotMarathi)
-Presenceorabsenceofaparticularrule(egEyelashRephconstructisrequiredin
MarathiKonkaniandNepalibutnotinHindi)
Itisworthnotingthattherulesrequiredforaccommodationofadditionallanguagesinthe
Devanagari ruleset apart from those required for Hindi are never in conflict with oneanother
In Section 7 the Whole Label Evaluation (WLE) rules are given which cover all the
languagesunderthepurviewoftheNBGPfortheDevanagariscript
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second section lists four operators, along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol   Function
|        Alternative
[ ]      Optional
*        Variable Repetition
( )      Sequence Group
Table 8: Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may optionally be followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description     Sequence   Example   Constituting characters
Vowel                    V          अ a       U+0905
Vowel + Anusvara         V[B]       अं aṁ     अ ं (U+0905 U+0902)
Vowel + Candrabindu      V[D]       अँ aṃ     अ ँ (U+0905 U+0901)
Vowel + Visarga          V[X]       अः aḥ     अ ः (U+0905 U+0903)
Table 9
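The vowel sequence V[B|D|X] can be sketched as a regular expression. This is a minimal illustration, not part of the LGR: the vowel set below is an assumed subset of the repertoire, and the name is_vowel_sequence is hypothetical.

```python
import re

# Assumption: a small illustrative subset of the vowels (V); the
# authoritative list is the code point repertoire of the proposal.
VOWELS = "\u0905\u0906\u0907\u0908\u0909\u090A\u090F\u0910\u0913\u0914"
ANUSVARA = "\u0902"      # B
CANDRABINDU = "\u0901"   # D
VISARGA = "\u0903"       # X

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(
    "[" + VOWELS + "][" + ANUSVARA + CANDRABINDU + VISARGA + "]?"
)

def is_vowel_sequence(text):
    """Return True if text is a single vowel sequence V[B|D|X]."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Because the optional class matches at most one character, a Visarga following an Anusvara or Candrabindu is rejected, as the text above requires.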
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may optionally be followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. The Consonant sequence can be extended further after N, M and H. Each of these is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign wherever such a case is pertinent.)
Examples:
Sequence Description   Sequence   Example   Constituting characters
Consonant              C          क ka      U+0915 <single character>
Consonant + Nukta      C[N]       क़ ḳa     क ़ (U+0915 U+093C)
Table 10
2. A consonant optionally followed by a dependent vowel sign (Matra [M]), Anusvara [B], Candrabindu [D], Visarga [X] or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description      Sequence   Example               Constituting characters
Consonant + Matra         C[M]       कि ki                 क ि (U+0915 U+093F)
Consonant + Anusvara      C[B]       कं kaṁ                क ं (U+0915 U+0902)
Consonant + Candrabindu   C[D]       कँ kaṃ                क ँ (U+0915 U+0901)
Consonant + Visarga       C[X]       कः kaḥ                क ः (U+0915 U+0903)
Consonant + Halant        C[H]       क् k (Pure Consonant)  क ् (U+0915 U+094D)
Table 11
2A. A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description              Sequence   Example    Constituting characters
Consonant + Matra + Anusvara      CM[B]      कीं kīṁ    क ी ं (U+0915 U+0940 U+0902)
Consonant + Matra + Candrabindu   CM[D]      काँ kāṃ    क ा ँ (U+0915 U+093E U+0901)
Consonant + Matra + Visarga       CM[X]      कीः kīḥ    क ी ः (U+0915 U+0940 U+0903)
Table 12
3. A sequence of consonants (up to 4) joined by Halant (footnote 14):
*3(CH)C
Example:
Sequence Description: Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant
Sequence: CHCHCHC
Example: न्क्र्य nkrya
Constituting characters: U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
14 In the case of Sanskrit, up to 5 consonants can be joined.
Subsets
3A. The combination may be followed by M, B, D or X.
Example:
Sequence Description                           Sequence   Example       Constituting characters
Consonant + Halant + Consonant + Matra         CHC[M]     क्की kkī     U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara      CHC[B]     क्कं kkaṁ    U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu   CHC[D]     क्कँ kkaṃ    U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga       CHC[X]     क्कः kkaḥ    U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X.
Example:
Sequence Description                                   Sequence   Example        Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara      CHCM[B]    क्कीं kkīṁ    U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu   CHCM[D]    क्कीँ kkīṃ    U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga       CHCM[X]    क्कीः kkīḥ    U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules when one takes into account digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
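The consonant sequence rules 1 to 3B can be folded into one pattern over category letters (C, N, H, M, B, D, X). The sketch below is an illustrative reading, not the normative rule set: it assumes that every consonant in a conjunct may carry a Nukta, and that a final Halant is also acceptable after a conjunct. The function name is hypothetical.

```python
import re

# (?:CN?H){0,3}  up to three "consonant + Halant" links, so at most 4 consonants
# CN?            the last consonant, optionally carrying a Nukta
# H | M?[BDX]?   either a final Halant, or an optional Matra followed by
#                at most one of Anusvara, Candrabindu or Visarga
CONSONANT_AKSHAR = re.compile(r"(?:CN?H){0,3}CN?(?:H|M?[BDX]?)")

def is_consonant_akshar(categories):
    """Check a string of category letters, e.g. 'CHCM', against the pattern."""
    return CONSONANT_AKSHAR.fullmatch(categories) is not None
```

For instance, is_consonant_akshar("CHCHCHC") accepts the four-consonant conjunct of Table 13, while a fifth consonant is rejected by the {0,3} bound. The WLE rules of Section 7 drop that bound.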
6 Variants
There are no characters or character sequences in Devanagari which can be created using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed:
Variant 1      Variant 2
आ (U+0906)     आ़ (U+0906 U+093C)
ओ (U+0913)     ओ़ (U+0913 U+093C)
ा (U+093E)     ा़ (U+093E U+093C)
ो (U+094B)     ो़ (U+094B U+093C)
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. E.g. in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the leading code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both the forward and the symmetric (reverse) variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
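The effect of the "not followed by Nukta" condition can be illustrated with a small variant generator. This is a sketch under stated assumptions, not the LGR tool's algorithm: the function name nukta_variants is hypothetical, and only the four base code points from the Variant 1 column of Table 16 are handled.

```python
from itertools import product

NUKTA = "\u093C"
# Variant 1 column of Table 16.
BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}

def nukta_variants(label):
    """Return all labels reachable by adding or removing a Nukta after a
    Table 16 base, applying the mapping only when the source (with or
    without its own Nukta) is not followed by a further Nukta."""
    choices = []
    i = 0
    while i < len(label):
        ch = label[i]
        if ch in BASES:
            has_nukta = label[i + 1:i + 2] == NUKTA
            end = i + 2 if has_nukta else i + 1
            if label[end:end + 1] == NUKTA:
                # Context rule fails: keep the source unchanged.
                choices.append([label[i:end]])
            else:
                choices.append([ch, ch + NUKTA])
            i = end
        else:
            choices.append([ch])
            i += 1
    return {"".join(parts) for parts in product(*choices)} - {label}
```

With this condition, U+0906 and U+0906 U+093C are variants of each other, while the invalid U+0906 U+093C U+093C produces no variant at all, so the substitution cannot regenerate.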
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set             Mapping
Variant Set A   093E 0901 <--> 0949 0902
Variant Set B   0906 0901 <--> 0911 0902
Variant Set C   0906 0902 <--> 0974
Variant Set D   093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source   Glyph   Target      Glyph   Type        Variant Context
0906     आ       0906 093C   आ़      ↔ blocked   not followed-by-N
093E     ा       093E 093C   ा़      ↔ blocked   not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E respectively into the left-hand-side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set             Mapping                                          Variant Contextual Rule
Variant Set A   093E 0901 <--> 093E 093C 0901 <--> 0949 0902     -
Variant Set B   0906 0901 <--> 0906 093C 0901 <--> 0911 0902     -
Variant Set C   0906 0902 <--> 0906 093C 0902 <--> 0974          when(followed-by-V-C-or-end)
Variant Set D   093B <--> 093E 0902 <--> 093E 093C 0902          when(followed-by-V-C-or-end)
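Symmetry and transitivity mean that every member of a set maps to every other member, in both directions, under the same context. A minimal sketch of expanding the four extended sets into explicit mappings follows. The data layout and the name mappings are illustrative assumptions, not the structure of the accompanying XML file.

```python
from itertools import permutations

# Extended variant sets from the concluding table (code points in hex).
VARIANT_SETS = {
    "A": ["093E 0901", "093E 093C 0901", "0949 0902"],
    "B": ["0906 0901", "0906 093C 0901", "0911 0902"],
    "C": ["0906 0902", "0906 093C 0902", "0974"],
    "D": ["093B", "093E 0902", "093E 093C 0902"],
}
# Sets C and D carry a variant context rule; A and B carry none.
CONTEXT = {"C": "followed-by-V-C-or-end", "D": "followed-by-V-C-or-end"}

def mappings(set_name):
    """List (source, target, context) for every ordered pair in a set."""
    context = CONTEXT.get(set_name)
    members = VARIANT_SETS[set_name]
    return [(src, dst, context) for src, dst in permutations(members, 2)]
```

A three-member set expands to six directed mappings, which is why adding the Nukta sequences to each set is enough to keep the system closed.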
Additional Notes:
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapping with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1      Variant 2
ॳ (U+0973)     अं (U+0905 U+0902)
ऺ (U+093A)     ं (U+0902)
ॴ (U+0974)     आं (U+0906 U+0902)
ऻ (U+093B)     ां (U+093E U+0902)
ऎ (U+090E)     ऐ (U+0910)
ॆ (U+0946)     े (U+0947)
ॵ (U+0975)     औ (U+0914)
ॏ (U+094F)     ौ (U+094C)
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu ँ (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which when succeeded by an Anusvara ं (U+0902) create a shape which resembles a Candrabindu ँ (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1             Variant 2
ँ (U+0901)            ॅं (U+0945 U+0902)
ाँ (U+093E U+0901)    ॉं (U+0949 U+0902)
अँ (U+0905 U+0901)    ॲं (U+0972 U+0902)
एँ (U+090F U+0901)    ऍं (U+090D U+0902)
आँ (U+0906 U+0901)    ऑं (U+0911 U+0902)
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, like a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, U+0945 U+0902 and U+0949 U+0902 should each be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, Vowel, Nukta or Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a Consonant or a Consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
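The context condition on the Candrabindu mapping can be sketched as follows. This is an illustration only: the consonant test uses the range U+0915 to U+0939 as an assumed approximation of the consonant category, and the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093C"

def is_consonant(ch):
    # Assumption: U+0915..U+0939 approximates the consonant category (C);
    # the authoritative list is the code point repertoire of the proposal.
    return "\u0915" <= ch <= "\u0939"

def candrabindu_variant(label):
    """Replace each Candrabindu by U+0945 U+0902, but only where it is
    preceded by a Consonant or by a Consonant followed by a Nukta."""
    out = []
    for i, ch in enumerate(label):
        prev = label[i - 1] if i > 0 else ""
        prev2 = label[i - 2] if i > 1 else ""
        after_c = prev != "" and is_consonant(prev)
        after_cn = prev == NUKTA and prev2 != "" and is_consonant(prev2)
        if ch == CANDRABINDU and (after_c or after_cn):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu after a Vowel or a Matra is left untouched by this mapping. Those positions are covered by the other rows of Table 18, which this sketch does not implement.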
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + …" cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are the variants that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari                             Gurmukhi
ं (U+0902)                             ਂ (U+0A02)
इ (U+0907)                             ਙ (U+0A19)
उ (U+0909)                             ਤ (U+0A24)
ग (U+0917)                             ਗ (U+0A17)
घ (U+0918)                             ਬ (U+0A2C)
ट (U+091F)                             ਟ (U+0A1F)
ठ (U+0920)                             ਠ (U+0A20)
ढ (U+0922)                             ਫ (U+0A2B)
प (U+092A)                             ਧ (U+0A27)
भ (U+092D)                             ਮ (U+0A2E)
म (U+092E)                             ਸ (U+0A38)
व (U+0935)                             ਕ (U+0A15)
ह (U+0939)                             ਵ (U+0A35)
ऺ (U+093A)                             ਂ (U+0A02)
़ (U+093C)                             ਼ (U+0A3C)
ि (U+093F)                             ਿ (U+0A3F)
ी (U+0940)                             ੀ (U+0A40)
ॅ (U+0945)                             ੱ (U+0A71)
ॆ (U+0946)                             ੇ (U+0A47)
ॆ (U+0946)                             ੋ (U+0A4B)
े (U+0947)                             ੇ (U+0A47)
े (U+0947)                             ੋ (U+0A4B)
ै (U+0948)                             ੈ (U+0A48)
ॖ (U+0956)                             ੁ (U+0A41)
ॗ (U+0957)                             ੂ (U+0A42)
प्टि (U+092A U+094D U+091F U+093F)     ਇ (U+0A07)
प्टी (U+092A U+094D U+091F U+0940)     ਈ (U+0A08)
प्टे (U+092A U+094D U+091F U+0947)     ਏ (U+0A0F)
प्टॆ (U+092A U+094D U+091F U+0946)     ਏ (U+0A0F)
त्त (U+0924 U+094D U+0924)             ਜ (U+0A1C)
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari     Bengali
म (U+092E)     ম (U+09AE)
ि (U+093F)     ি (U+09BF)
Table 20: Proposed Cross-script Devanagari-Bengali Variants
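A whole-label cross-script variant exists only when every element of a label has a counterpart in the other script. The sketch below uses a handful of one-to-one rows from Table 19. It is an assumption-laden illustration: it ignores the multi-code-point rows and the one-to-many rows (such as U+0946, which maps to both U+0A47 and U+0A4B), and the function name is hypothetical.

```python
# A subset of the single-code-point rows of Table 19 (Devanagari -> Gurmukhi).
DEVA_TO_GURU = {
    "\u0917": "\u0A17",
    "\u091F": "\u0A1F",
    "\u0920": "\u0A20",
    "\u092A": "\u0A27",
    "\u092E": "\u0A38",
    "\u093F": "\u0A3F",
    "\u0940": "\u0A40",
}

def gurmukhi_lookalike(label):
    """Return the Gurmukhi label that the whole Devanagari label resembles,
    or None if any code point has no counterpart in the mapping."""
    if not label or any(ch not in DEVA_TO_GURU for ch in label):
        return None
    return "".join(DEVA_TO_GURU[ch] for ch in label)
```

A label containing even one code point outside the table, such as क (U+0915), has no whole-label Gurmukhi counterpart under this mapping.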
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (footnote 15)
3. M must be preceded by C or CN (footnote 16)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
15, 16: where CN is a C followed by an N.
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
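WLE rules 1 to 7 can be sketched as a left-to-right scan that checks each code point against its predecessor. This is a rough illustration rather than the normative XML rule set. The code point ranges in category are assumptions standing in for the Table 6 categories, the Eyelash Reph sequences are not modelled, and the function names are hypothetical.

```python
NUKTA, HALANT = "\u093C", "\u094D"
SIGNS = {"\u0902": "B", "\u0901": "D", "\u0903": "X"}
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
RULE_FOR = {"X": 4, "B": 5, "D": 6}

def category(ch):
    """Assumed, approximate category assignment by code point range."""
    cp = ord(ch)
    if ch == NUKTA:
        return "N"
    if ch == HALANT:
        return "H"
    if ch in SIGNS:
        return SIGNS[ch]
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x0905 <= cp <= 0x0914 or 0x0972 <= cp <= 0x0975:
        return "V"
    if cp in (0x093A, 0x093B, 0x094F, 0x0956, 0x0957) or 0x093E <= cp <= 0x094C:
        return "M"
    raise ValueError("code point outside the illustrative repertoire: U+%04X" % cp)

def wle_violations(label):
    """Return the numbers of the WLE rules (1 to 7) that the label breaks."""
    broken = []
    for i, ch in enumerate(label):
        cat = category(ch)
        prev = label[i - 1] if i > 0 else ""
        prev_cat = category(prev) if prev else ""
        # "C or CN": directly after a consonant, or after consonant + Nukta.
        after_c_or_cn = prev_cat == "C" or (
            prev_cat == "N" and i > 1 and category(label[i - 2]) == "C"
        )
        if cat == "N" and prev not in (C1 | V1 | M1):
            broken.append(1)
        elif cat == "H" and not after_c_or_cn:
            broken.append(2)
        elif cat == "M" and not after_c_or_cn:
            broken.append(3)
        elif cat in RULE_FOR and prev_cat not in ("V", "C", "N", "M"):
            broken.append(RULE_FOR[cat])
        elif cat == "V" and prev_cat == "H":
            broken.append(7)
    return broken
```

For example, the label U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 is reported as breaking rule 7, because the vowel U+0905 directly follows a Halant.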
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rule is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
In addition following members externally gave inputs to NBGP for the respectivelanguagesscripts
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
9 References
[MSR]IntegrationPanelMaximalStartingRepertoiremdashMSR-4OverviewandRationale7February2019httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf(Accessedon18thFeb2019)
[EGIDS]ExpandedGradedIntergenerationalDisruptionScalehttpswwwethnologuecomaboutlanguage-status(Accessedon13thNov2017)
[NBGP]Neo-BrahmiGenerationPanel
[RFC 7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC 8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", RFC 8228, August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD]genericTopLevelDomain
[ISCII]IndianScriptCodeforInformationInterchangehttpscdacinindexaspxid=mlc_gist_iscii(Accessedon2ndFeb2018)
[GIST]GraphicsIntelligencebasedScriptTechnologieshttpscdacinindexaspxid=gist(Accessedon2ndFeb2018)
[C-DAC]CentreforDevelopmentofAdvancedComputinghttpscdacin(Accessedon2ndFeb2018)
[0]TheUnicodeStandard11httpwwwunicodeorgversionsUnicode110(Accessedon12thDec2017)
[8]TheUnicodeStandard50httpwwwunicodeorgversionsUnicode500(Accessedon12thDec2017)
[9]TheUnicodeStandard51httpwwwunicodeorgversionsUnicode510(Accessedon12thDec2017)
[11]TheUnicodeStandard60httpwwwunicodeorgversionsUnicode600(Accessedon12thDec2017)
[100]DevanāgarīVIPTeamldquoVariantIssuesReportrdquoICANN3rdOct2011httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf(Accessedon10thOct2017)
[101]OmniglotHindihttpswwwomniglotcomwritinghindihtm(Accessedon10thOct2017)
[102]OmniglotMarathihttpswwwomniglotcomwritingmarathihtm(Accessedon10thOct2017)
[103]OmniglotSanskrithttpswwwomniglotcomwritingsanskrithtm(Accessedon10thOct2017)
[104]OmniglotSindhihttpswwwomniglotcomwritingsindhihtm(Accessedon10thOct2017)
[105]OmniglotKashmirihttpswwwomniglotcomwritingkashmirihtm(Accessedon10thOct2017)
[106]Unicode1000SouthandCentralAsia-I-OfficialScriptsofIndiardquoPage456(R5andR5a)httpwwwunicodeorgversionsUnicode1000ch12pdf(Accessedon13thNov2017)
[107]UnicodeIndicGroupDevanagariEyelashRahttpunicodeorg~emulleriwgp8utcdochtml(Accessedon13thNov2017)
[108]MKRainaHowtoreadandwriteKashmiriinDevanagarihttpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf(Accessedon12thDec2017)
[109]CentralHindiDirectorate-MinistryofHRD-GovtofIndiaDevanāgarīAlphabetanditsRomanizationhttphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml(Accessedon12thDec2017
[110]OmniglotBodohttpswwwomniglotcomwritingbodohtm(Accessedon12thDec2017)
[111]OmniglotMaithilihttpswwwomniglotcomwritingmaithilihtm(Accessedon12thDec2017)
[112]OmniglotKonkanihttpswwwomniglotcomwritingkonkanihtm(Accessedon20thMay2018)
[113]OmniglotNepalihttpswwwomniglotcomwritingnepalihtm(Accessedon20thMay2018)
[114] NBGP Public comment feedback for Devanagari Gujarati Gurmukhi Script LGRProposalshttpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw(Accessedon18thFeb2019)
10 Booksarticlesandwebographiesconsulted
Followingisathematicallysortedsetofdocumentsbooksarticlesandwebographies
consultedinthedraftingofthisreport
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
11 Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12 Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13 McGregor, R. S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14 McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15 McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16 McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17 Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18 Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19 Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20 Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21 Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22 Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23 Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1 IS 10401: 8-bit code for information interchange. 1982.
2 IS 10315: 7-bit coded character set for information interchange. 1985.
3 IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques. 1987.
4 ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters. 2001.
5 ISO 2375: Procedure for registration of escape sequences. 2003.
6 ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13. 1998-2001.
7 IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters / sequences
Table 21 below shows characters / character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1    Confusable 2
क U+0915    क़ U+0915 U+093C
ख U+0916    ख़ U+0916 U+093C
ग U+0917    ग़ U+0917 U+093C
च U+091A    च़ U+091A U+093C
छ U+091B    छ़ U+091B U+093C
ज U+091C    ज़ U+091C U+093C
ड U+0921    ड़ U+0921 U+093C
ढ U+0922    ढ़ U+0922 U+093C
फ U+092B    फ़ U+092B U+093C
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same. However, in a given context (e.g. in a browser bar, as part of a domain name, or as a single word with no surrounding text from the same script to distinguish it) they can create visual confusion.
A label can be considered to have a cross-script variant label only if all its constituent characters / aksharas have an equivalent confusable in the other script. If there is even a single character / akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence makes the string non-confusable.
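The all-or-nothing condition above can be sketched in a few lines of Python. This is only an illustration: the `CONFUSABLES` mapping is a hand-copied subset of the Devanagari to Gurmukhi rows of Table 22, and the function name is hypothetical rather than part of any LGR tool.

```python
# Illustrative subset of Table 22 (Devanagari -> Gurmukhi); not the full table.
CONFUSABLES = {
    "\u0920": {"\u0A28", "\u0A30"},  # DEVANAGARI TTHA -> GURMUKHI NA, RA
    "\u0921": {"\u0A21", "\u0A24"},  # DEVANAGARI DDA  -> GURMUKHI DDA, TA
    "\u0922": {"\u0A22"},            # DEVANAGARI DDHA -> GURMUKHI DDHA
    "\u0924": {"\u0A1C"},            # DEVANAGARI TA   -> GURMUKHI JA
    "\u092F": {"\u0A27"},            # DEVANAGARI YA   -> GURMUKHI DHA
}


def has_cross_script_counterpart(label, table=CONFUSABLES):
    """Return True only if every character of the label has a confusable
    counterpart in the other script (hypothetical helper)."""
    return len(label) > 0 and all(ch in table for ch in label)
```

A label made only of listed characters (e.g. U+0921 U+0922) passes the check, while a single unlisted character (e.g. U+0915) is enough to make the whole label distinguishable.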
Devanagari confusable    Other script confusable    From script
ः U+0903    ઃ U+0A83    Gujarati
ः U+0903    ః U+0C03    Telugu
ः U+0903    ಃ U+0C83    Kannada
ः U+0903    ഃ U+0D03    Malayalam
ः U+0903    ඃ U+0D83    Sinhala
ः U+0903    ঃ U+0983    Bengali (footnote 17)
उ U+0909    ও U+0993    Bengali
घ U+0918    ঘ U+0998    Bengali
ठ U+0920    ਨ U+0A28    Gurmukhi
ठ U+0920    ਰ U+0A30    Gurmukhi
ड U+0921    ਡ U+0A21    Gurmukhi
ड U+0921    ਤ U+0A24    Gurmukhi
ढ U+0922    ਢ U+0A22    Gurmukhi
त U+0924    ਜ U+0A1C    Gurmukhi
य U+092F    ਧ U+0A27    Gurmukhi
ॅ U+0945    ঁ U+0981    Bengali
Table 22: Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
2    0931 094D 0939    ऱ्ह    DEVANAGARI LETTER RRA, DEVANAGARI SIGN VIRAMA, DEVANAGARI LETTER HA    Konkani, Marathi, Nepali    [106][107]
Table 7: Sequences
5.3 Code points not included
The following code points have not been included in the repertoire:
Sr. No.    Unicode Code Point    Glyph    Character Name    Reason for exclusion
1    U+0904    ऄ    DEVANAGARI LETTER SHORT A    Usage unknown. Not required explicitly by any language.
2    U+090C    ऌ    DEVANAGARI LETTER VOCALIC L    Not in modern usage. Excluded as per conservatism principle.
3    U+0929    ऩ    DEVANAGARI LETTER NNNA    Not required in any spoken language. Required only for transcribing Dravidian alveolar n.
4    U+0934    ऴ    DEVANAGARI LETTER LLLA    Not required in any spoken language. Required only for transcribing Dravidian l.
5    U+0944    ॄ    DEVANAGARI VOWEL SIGN VOCALIC RR    Not in modern usage. Excluded as per conservatism principle.
6    U+0979    ॹ    DEVANAGARI LETTER ZHA    Not required in any spoken language. Required only in transliteration of Avestan.
7    U+097A    ॺ    DEVANAGARI LETTER HEAVY YA    Usage unknown. Not required explicitly by any language.
5.4 Structural Formation of Devanagari
All the languages written in Brahmi-derived scripts follow a particular way of forming their words, known as the akshar. The next section gives detailed akshar formation rules as applicable to the representation of the Hindi language when written in the Devanagari script. These rules need slight additions for different languages written in Devanagari, in terms of:
- Character addition / deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required to accommodate additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first subsection lists the categories of characters in the form of variables; in the rules, the variable names are used instead of the descriptive names. The second subsection lists four operators, along with their functions, which are assumed while specifying the rules. The following two subsections describe the two major categories of akshar formations, the first of which begins with a vowel and the second with a consonant. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen (-)
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
5.5.2 Operators used
Symbol    Function
|    Alternative
[ ]    Optional
    Variable Repetition
( )    Sequence Group
Table 8: Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may optionally be followed by an Anusvara (B), a Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples:
Sequence Description    Sequence    Example    Constituting characters
Vowel    V    अ (a)    U+0905
Vowel + Anusvara    V[B]    अं (aṁ)    U+0905 U+0902
Vowel + Candrabindu    V[D]    अँ (aṃ)    U+0905 U+0901
Vowel + Visarga    V[X]    अः (aḥ)    U+0905 U+0903
Table 9
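The rule V[B|D|X] maps directly onto a regular expression. The following Python sketch is only illustrative: it assumes the vowel class is the contiguous range U+0905 to U+0914, which approximates rather than reproduces the LGR repertoire, and the function name is hypothetical.

```python
import re

VOWELS = "\u0905-\u0914"     # assumption: approximate range of independent vowels
BDX = "\u0902\u0901\u0903"   # Anusvara (B), Candrabindu (D), Visarga (X)

# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(f"[{VOWELS}][{BDX}]?")


def is_vowel_sequence(text):
    """Return True if the whole string is a single vowel sequence."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

Under this sketch the rows of Table 9 are accepted, while a vowel followed by both an Anusvara and a Visarga is rejected because at most one of B, D or X may follow.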
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may optionally be followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of each of these characters occurring after a consonant is restricted to one. The consonant sequence can be extended further after the N, M and H. Each of these cases is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)
Examples:
Sequence Description    Sequence    Example    Constituting characters
Consonant    C    क (ka)    U+0915 <single character>
Consonant + Nukta    C[N]    क़ (ḳa)    U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples:
Sequence Description    Sequence    Example    Constituting characters
Consonant + Matra    C[M]    कि (ki)    U+0915 U+093F
Consonant + Anusvara    C[B]    कं (kaṁ)    U+0915 U+0902
Consonant + Candrabindu    C[D]    कँ (kaṃ)    U+0915 U+0901
Consonant + Visarga    C[X]    कः (kaḥ)    U+0915 U+0903
Consonant + Halant    C[H]    क् (k, pure consonant)    U+0915 U+094D
Table 11
2A. A CM sequence can optionally be followed by D, B or X:
(CM)[D|B|X]
Example:
Sequence Description    Sequence    Example    Constituting characters
Consonant + Matra + Anusvara    CM[B]    कीं (kīṁ)    U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu    CM[D]    काँ (kāṃ)    U+0915 U+093E U+0901
Consonant + Matra + Visarga    CM[X]    कीः (kīḥ)    U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4) joined by Halant (see footnote 14):
3(CH)C
Example:
Sequence Description    Sequence    Example    Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant    CHCHCHC    न्क्र्य (nkrya)    U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
14 In the case of Sanskrit, it can join up to 5 consonants.
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets:
3A. The combination may be followed by M, B, D or X.
Example:
Sequence Description    Sequence    Example    Constituting characters
Consonant + Halant + Consonant + Matra    CHC[M]    क्की (kkī)    U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara    CHC[B]    क्कं (kkaṁ)    U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu    CHC[D]    क्कँ (kkaṃ)    U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga    CHC[X]    क्कः (kkaḥ)    U+0915 U+094D U+0915 U+0903
Table 14
3B. 3(CH)CM may be followed by a B, D or X.
Example:
Sequence Description    Sequence    Example    Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara    CHCM[B]    क्कीं (kkīṁ)    U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu    CHCM[D]    क्कीँ (kkīṃ)    U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga    CHCM[X]    क्कीः (kkīḥ)    U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules when one takes into account the digits, punctuation and special standalone characters like the Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
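As a rough illustration, the consonant-sequence rules of Section 5.5.4 (cases 1 to 3B) can be folded into one regular expression. The Python sketch below makes simplifying assumptions that are not part of the proposal: consonants are taken as the range U+0915 to U+0939 and matras as U+093E to U+094C, and an optional Nukta is allowed after every consonant of a cluster. The function name is hypothetical.

```python
import re

C = "\u0915-\u0939"          # assumption: approximate consonant range
M = "\u093E-\u094C"          # assumption: approximate matra range
N, H = "\u093C", "\u094D"    # Nukta, Halant
BDX = "\u0902\u0901\u0903"   # Anusvara, Candrabindu, Visarga

CONSONANT_SEQUENCE = re.compile(
    f"(?:[{C}]{N}?{H}){{0,3}}"   # up to three "consonant (+ nukta) + halant" links
    f"[{C}]{N}?"                 # last consonant, optionally with a nukta
    f"(?:{H}|[{M}]?[{BDX}]?)"    # a closing halant, or [M] followed by [B|D|X]
)


def is_consonant_sequence(text):
    """Return True if the whole string is one consonant sequence (Hindi rules)."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

The `{0,3}` bound mirrors the limit of four consonants per cluster stated for Hindi; the WLE rules in Section 7 do not impose this limit, so a checker for the full LGR would drop it.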
6 Variants
There are no characters / character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and which look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed as variants, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These are not cases of mere visual similarity, as they involve some deviation from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and hence are being proposed as variants. A brief description of these variants follows; the variants themselves are listed in Table 16 and Table 17.
6.1 Vowel / Vowel sign followed by Nukta
The Santali language has a unique requirement for the positioning of the Nukta character (U+093C) which is not common in other Devanagari-based languages: Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated opening up the Whole Label Evaluation rules (given in Section 7) for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed:
Variant 1    Variant 2
आ U+0906    आ़ U+0906 U+093C
ओ U+0913    ओ़ U+0913 U+093C
ा U+093E    ा़ U+093E U+093C
ो U+094B    ो़ U+094B U+093C
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. E.g. in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
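The effect of the "not followed by Nukta" condition can be illustrated with a small Python sketch that enumerates the Nukta variants of a label. The function name and the enumeration strategy are illustrative assumptions and do not describe how any particular LGR tool computes variants.

```python
from itertools import product

NUKTA = "\u093C"
# Variant 1 characters of Table 16: AA, O, vowel sign AA, vowel sign O.
NUKTA_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variant_labels(label):
    """Enumerate labels reachable through the Table 16 mappings (hypothetical helper).

    A base character, with or without its Nukta, is treated as one unit. The unit
    gets the alternatives (base, base + Nukta) only when it is not followed by a
    further Nukta, which is the variant context rule of this section.
    """
    options = []
    i = 0
    while i < len(label):
        ch = label[i]
        if ch in NUKTA_BASES:
            end = i + 2 if label[i + 1:i + 2] == NUKTA else i + 1
            if label[end:end + 1] == NUKTA:
                options.append((label[i:end],))   # context fails: no variant here
            else:
                options.append((ch, ch + NUKTA))  # both forms are variants
            i = end
        else:
            options.append((ch,))
            i += 1
    return {"".join(choice) for choice in product(*options)}
```

For the label U+0906 U+092E the sketch yields exactly two labels (with and without the Nukta after U+0906), and the same two labels result when starting from U+0906 U+093C U+092E, so the relation is symmetric and never produces a double Nukta.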
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set    Mapping
Variant Set A    093E 0901 <--> 0949 0902
Variant Set B    0906 0901 <--> 0911 0902
Variant Set C    0906 0902 <--> 0974
Variant Set D    093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source    Glyph    Target    Glyph    Type    Variant Context
0906    आ    0906 093C    आ़    ↔ blocked    not followed-by-N
093E    ा    093E 093C    ा़    ↔ blocked    not followed-by-N
When the variant from the overlapping sets is substituted for 0906 or 093E respectively in the sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence or absence of the Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A
0906 093C 0901 is added to Set B
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise, for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set    Mapping    Variant Contextual Rule
Variant Set A    093E 0901 <--> 093E 093C 0901 <--> 0949 0902    -
Variant Set B    0906 0901 <--> 0906 093C 0901 <--> 0911 0902    -
Variant Set C    0906 0902 <--> 0906 093C 0902 <--> 0974    when(followed-by-V-C-or-end)
Variant Set D    093B <--> 093E 0902 <--> 093E 093C 0902    when(followed-by-V-C-or-end)
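The extension of Sets A to D amounts to taking the transitive closure of the pairwise mappings. The Python sketch below illustrates this with a simple set-merging helper; the helper name and the string representation of code point sequences are assumptions made for the example, and the context rules are deliberately left out.

```python
def merge_variant_sets(mappings):
    """Group pairwise variant mappings into transitively closed sets."""
    sets = []
    for a, b in mappings:
        merged = {a, b}
        untouched = []
        for existing in sets:
            if existing & merged:        # shares a member: fold into the new set
                merged |= existing
            else:
                untouched.append(existing)
        sets = untouched + [merged]
    return sets


# Original mappings of Variant Sets A to D.
BASE = [
    ("093E 0901", "0949 0902"),
    ("0906 0901", "0911 0902"),
    ("0906 0902", "0974"),
    ("093B", "093E 0902"),
]
# Mappings induced by the Nukta variants of 0906 and 093E.
NUKTA_EXTENSIONS = [
    ("093E 0901", "093E 093C 0901"),
    ("0906 0901", "0906 093C 0901"),
    ("0906 0902", "0906 093C 0902"),
    ("093E 0902", "093E 093C 0902"),
]
SETS = merge_variant_sets(BASE + NUKTA_EXTENSIONS)
```

Each of the four resulting sets has three members, matching the final table above.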
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapping with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels / Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel / Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1    Variant 2
ॳ U+0973    अं U+0905 U+0902
ऺ U+093A    ं U+0902
ॴ U+0974    आं U+0906 U+0902
ऻ U+093B    ां U+093E U+0902
ऎ U+090E    ऐ U+0910
ॆ U+0946    े U+0947
ॵ U+0975    औ U+0914
ॏ U+094F    ौ U+094C
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in a Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, many users tend to ignore the phonetic effect of its presence or absence when it comes at the end. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there are also users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case for making them variant pairs. In the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual variant pairs are as follows:
Variant 1    Variant 2
ँ U+0901    ॅं U+0945 U+0902
ाँ U+093E U+0901    ॉं U+0949 U+0902
अँ U+0905 U+0901    ॲं U+0972 U+0902
एँ U+090F U+0901    ऍं U+090D U+0902
आँ U+0906 U+0901    ऑं U+0911 U+0902
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, U+0945 U+0902 and U+0949 U+0902 should each be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, a Vowel, a Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a Consonant or a Consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire, and as such it must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
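The context condition of this rule can be sketched as a small Python check that finds the positions in a label where the mapping U+0901 -> U+0945 U+0902 is defined. The consonant range U+0915 to U+0939 is a simplifying assumption and the function name is hypothetical.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"


def is_consonant(ch):
    # Assumption: consonants approximated by the range U+0915..U+0939.
    return "\u0915" <= ch <= "\u0939"


def candrabindu_variant_sites(label):
    """Return the indices of every Candrabindu that is preceded by a Consonant,
    or by a Consonant followed by a Nukta (the context of Section 6.4.1)."""
    sites = []
    for i, ch in enumerate(label):
        if ch != CANDRABINDU or i == 0:
            continue
        prev = label[i - 1]
        after_c = is_consonant(prev)
        after_cn = prev == NUKTA and i >= 2 and is_consonant(label[i - 2])
        if after_c or after_cn:
            sites.append(i)
    return sites
```

A Candrabindu directly after a vowel or a matra yields no site under this mapping; the Matra case is covered by the separate pair U+093E U+0901 <--> U+0949 U+0902 in Table 18.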
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing either of these variants is chosen first, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. Table 19 lists the cases proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari    Gurmukhi
ं U+0902    ਂ U+0A02
इ U+0907    ਙ U+0A19
उ U+0909    ਤ U+0A24
ग U+0917    ਗ U+0A17
घ U+0918    ਬ U+0A2C
ट U+091F    ਟ U+0A1F
ठ U+0920    ਠ U+0A20
ढ U+0922    ਫ U+0A2B
प U+092A    ਧ U+0A27
भ U+092D    ਮ U+0A2E
म U+092E    ਸ U+0A38
व U+0935    ਕ U+0A15
ह U+0939    ਵ U+0A35
ऺ U+093A    ਂ U+0A02
़ U+093C    ਼ U+0A3C
ि U+093F    ਿ U+0A3F
ी U+0940    ੀ U+0A40
ॅ U+0945    ੱ U+0A71
ॆ U+0946    ੇ U+0A47
ॆ U+0946    ੋ U+0A4B
े U+0947    ੇ U+0A47
े U+0947    ੋ U+0A4B
ै U+0948    ੈ U+0A48
ॖ U+0956    ੁ U+0A41
ॗ U+0957    ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F    ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940    ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947    ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946    ਏ U+0A0F
त्त U+0924 U+094D U+0924    ਜ U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari    Bengali
म U+092E    ম U+09AE
ि U+093F    ি U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLE rules that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA) and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (see footnote 15)
3. M must be preceded by C or CN (see footnote 16)
4. X must be preceded by any of V, C, N or M
5. B must be preceded by any of V, C, N or M
6. D must be preceded by any of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
15 where CN is a C followed by an N
16 where CN is a C followed by an N
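The seven rules above lend themselves to a simple left-to-right check over the code points of a label. The Python sketch below is a simplified illustration, not the normative LGR: the category ranges (V as U+0905 to U+0914, C as U+0915 to U+0939, M as U+093E to U+094C) only approximate Table 6, the Eyelash Reph and Kashmiri signs are ignored, and the function names are hypothetical.

```python
# C1, V1 and M1 from rule 1: the only characters a Nukta may follow.
NUKTA_BASES = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B"
                  "\u0906\u0913\u093E\u094B")
SIGNS = {0x0901: "D", 0x0902: "B", 0x0903: "X", 0x093C: "N", 0x094D: "H"}


def category(ch):
    """Map a character to its WLE symbol (approximate ranges, see lead-in)."""
    cp = ord(ch)
    if cp in SIGNS:
        return SIGNS[cp]
    if 0x0905 <= cp <= 0x0914:
        return "V"
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x093E <= cp <= 0x094C:
        return "M"
    return "?"


def wle_violations(label):
    """Return the numbers of the WLE rules (1 to 7) violated, in label order."""
    cats = [category(ch) for ch in label]
    violations = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        after_c_or_cn = prev == "C" or (prev == "N" and i > 1 and cats[i - 2] == "C")
        if cat == "N" and (i == 0 or label[i - 1] not in NUKTA_BASES):
            violations.append(1)
        elif cat == "H" and not after_c_or_cn:
            violations.append(2)
        elif cat == "M" and not after_c_or_cn:
            violations.append(3)
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            violations.append({"X": 4, "B": 5, "D": 6}[cat])
        elif cat == "V" and prev == "H":
            violations.append(7)
    return violations
```

For instance, the label U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 discussed under "Case of V preceded by H" is reported as violating rule 7 only.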
Additional rules are used only for variants where a Nukta maps to a null, or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or the end of the label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added only as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics and Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1 Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchison, London, 1968.
10.2 DEVANĀGARĪ
1 Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2 Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3 Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966.]
4 Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden/New York: E. J. Brill.
5 Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6 Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7 Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8 Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9 Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10 Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11 Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12 Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13 McGregor, R. S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14 McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15 McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16 McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17 Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18 Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19 Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20 Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21 Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22 Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23 Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1 IS 10401: 8-bit code for information interchange, 1982
2 IS 10315: 7-bit coded character set for information interchange, 1985
3 IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987
4 ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001
5 ISO 2375: Procedure for registration of escape sequences, 2003
6 ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001
7 IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same. However, in the given context, e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
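The all-or-nothing condition above can be illustrated with a minimal sketch. It assumes a small, hypothetical lookup table built from only the single-code-point Devanagari/Gurmukhi pairs of Table 22; the names `GURMUKHI_CONFUSABLES` and `is_fully_confusable` are illustrative and not part of the LGR.

```python
# Hypothetical subset of Table 22: Devanagari code point -> Gurmukhi confusables.
GURMUKHI_CONFUSABLES = {
    "\u0920": {"\u0A28", "\u0A30"},  # ठ -> ਨ, ਰ
    "\u0921": {"\u0A21", "\u0A24"},  # ड -> ਡ, ਤ
    "\u0922": {"\u0A22"},            # ढ -> ਢ
    "\u0924": {"\u0A1C"},            # त -> ਜ
    "\u092F": {"\u0A27"},            # य -> ਧ
}


def is_fully_confusable(label: str) -> bool:
    """Return True only if every character of the label has a confusable
    counterpart in the other script; one unmatched character is enough
    to make the whole label visually distinct."""
    return bool(label) and all(ch in GURMUKHI_CONFUSABLES for ch in label)
```

For example, a label made only of ठ and ड would qualify, while adding a single क (which has no entry in this illustrative table) would not.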
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 | Bengali (see note below)
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali
Table 22 Devanagari Cross-script confusables
Note: The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
- Character addition/deletion (e.g. the Nukta [U+093C] character is applicable for Hindi but not Marathi)
- Presence or absence of a particular rule (e.g. the Eyelash Reph construct is required in Marathi, Konkani and Nepali but not in Hindi)
It is worth noting that the rules required for accommodation of additional languages in the Devanagari ruleset, apart from those required for Hindi, are never in conflict with one another.
In Section 7, the Whole Label Evaluation (WLE) rules are given, which cover all the languages under the purview of the NBGP for the Devanagari script.
5.5 Akshar formation rules for Hindi
This section details the akshar formation rules as applicable to Hindi. The first section lists the categories of the characters in the form of variables. In the rules, the variable names are used instead of their descriptive names. The second section lists four operators along with their functions, which are assumed while specifying the rules. The following two sections describe the two major categories of akshar formations, the first of which begins with the vowels and the second with the consonants. These rules are based on an Indian Standard (IS 13194:1991), popularly known as the Indian Script Code for Information Interchange [ISCII].
5.5.1 Variables involved
Dash → Hyphen
Digit → Indo-Arabic digits [0-9]
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
5.5.2 Operators used
Symbol    Function
|         Alternative
[ ]       Optional
*         Variable Repetition
( )       Sequence Group
Table 8 Symbol functions
In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.
5.5.3 The Vowel Sequence
A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.
The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in the Bengali script.
The vowel sequence in Hindi is therefore: V[B|D|X]
Examples
Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ a | U+0905
Vowel + Anusvara | V[B] | अं aṁ | अ ं U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ aṃ | अ ँ U+0905 U+0901
Vowel + Visarga | V[X] | अः aḥ | अ ः U+0905 U+0903
Table 9
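As an illustration only, the vowel sequence V[B|D|X] can be written as a regular expression. The sketch below assumes that V is approximated by the independent vowels U+0905 to U+0914; the actual repertoire is the one defined by the LGR code point table, so this range and the names `VOWEL_SEQUENCE` and `is_vowel_sequence` are assumptions made for the example.

```python
import re

# V approximated as U+0905..U+0914 (assumption, not the LGR repertoire);
# the optional tail is one of D (U+0901), B (U+0902) or X (U+0903).
VOWEL_SEQUENCE = re.compile("[\u0905-\u0914][\u0901\u0902\u0903]?")


def is_vowel_sequence(text: str) -> bool:
    """Check whether text is exactly one vowel sequence V[B|D|X]."""
    return VOWEL_SEQUENCE.fullmatch(text) is not None
```

With this sketch, each row of Table 9 is accepted, while a vowel followed by both an Anusvara and a Visarga is rejected because at most one of B, D or X may follow V.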
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the consonant sequence after the N, M and H. Each of these is discussed in the following sections.
1. A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign wherever such a case is pertinent.)
Examples
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क ka | U+0915 <single character>
Consonant + Nukta | C[N] | क़ ḳa | क ़ U+0915 U+093C
Table 10
2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:
C[M|B|D|X|H]
Examples
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि ki | क ि U+0915 U+093F
Consonant + Anusvara | C[B] | कं kaṁ | क ं U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ kaṃ | क ँ U+0915 U+0901
Consonant + Visarga | C[X] | कः kaḥ | क ः U+0915 U+0903
Consonant + Halant | C[H] | क् k (Pure Consonant) | क ् U+0915 U+094D
Table 11
2A. A CM sequence can be optionally followed by D, B or X:
(CM)[D|B|X]
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं kīṁ | क ी ं U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ kāṃ | क ा ँ U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः kīḥ | क ी ः U+0915 U+0940 U+0903
Table 12
3. A sequence of consonants (up to 4; in the case of Sanskrit, up to 5) joined by Halant:
*3(CH)C
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य nkrya | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Subsets
3A. The combination may be followed by M, B, D or X.
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की kkī | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं kkaṁ | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ kkaṃ | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः kkaḥ | U+0915 U+094D U+0915 U+0903
Table 14
3B. *3(CH)CM may be followed by a B, D or X.
Example
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं kkīṁ | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ kkīṃ | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः kkīḥ | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuations and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
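The consonant sequence rules of Section 5.5.4 can be sketched as a single regular expression. This is a rough illustration under stated assumptions, not the normative rule set: C is approximated by U+0915 to U+0939 and M by U+093E to U+094C, a consonant with an optional Nukta is treated as one consonant unit, a trailing Halant is allowed after a conjunct as well as after a single consonant, and the limit of four joined consonants for Hindi is applied. The names `CONSONANT_SEQUENCE` and `is_consonant_sequence` are hypothetical.

```python
import re

C = "[\u0915-\u0939]"   # consonants (assumed range)
M = "[\u093E-\u094C]"   # matras (assumed range)
B, D, X = "\u0902", "\u0901", "\u0903"
H, N = "\u094D", "\u093C"

CONS = f"{C}{N}?"                    # consonant unit: C[N]
TAIL = f"(?:{M}?[{B}{D}{X}]?|{H})"   # [M][B|D|X], or a single final Halant
# *3(CH)C followed by the optional tail (rules 1, 2, 2A, 3, 3A and 3B).
CONSONANT_SEQUENCE = re.compile(f"(?:{CONS}{H}){{0,3}}{CONS}{TAIL}")


def is_consonant_sequence(text: str) -> bool:
    """Check whether text is exactly one Hindi consonant sequence."""
    return CONSONANT_SEQUENCE.fullmatch(text) is not None
```

The examples of Tables 10 to 15 are accepted by this pattern, while a conjunct of five consonants is rejected because of the Hindi limit assumed here.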
6 Variants
There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed as variants, as there is another panel (String similarity assessment panel) entrusted to deal with such cases. Table 21 Visually confusables in Appendix A Visually confusable characters/sequences lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to a possibility of creation of certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed.
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 Proposed Variants - Set 1 have a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906) within आ़ (U+0906 U+093C). By definition, this new case of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination आ़़ (U+0906 U+093C U+093C) where a Nukta will need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants as given below.
Rule: As per Table 16 Proposed Variants - Set 1, the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 set character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
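The effect of the "not followed by a Nukta" condition can be illustrated with a small sketch that enumerates the variant labels produced by the Table 16 mappings. It is an illustration under assumptions, not the LGR implementation: the function name `nukta_variant_labels` is hypothetical, labels are plain Python strings of code points, and no other variant sets or WLE rules are applied.

```python
from itertools import product

NUKTA = "\u093C"
# Variant 1 characters of Table 16: आ, ओ, ा, ो
SANTALI_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variant_labels(label: str) -> set:
    """Enumerate labels reachable through the Table 16 mappings,
    applying the same context condition in both directions."""
    choices = []
    i = 0
    while i < len(label):
        ch = label[i]
        if ch in SANTALI_BASES:
            has_nukta = label[i + 1:i + 2] == NUKTA
            end = i + 2 if has_nukta else i + 1
            if label[end:end + 1] != NUKTA:
                # Context satisfied: both members of the pair are possible.
                choices.append((ch, ch + NUKTA))
            else:
                # Followed by a further Nukta: no mapping is defined.
                choices.append((label[i:end],))
            i = end
        else:
            choices.append((ch,))
            i += 1
    return {"".join(parts) for parts in product(*choices)}
```

Because the condition is identical for the forward and reverse mapping, a label and its Nukta variant always produce the same set, which is the symmetry the section asks for.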
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E respectively into the left-hand side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A
0906 093C 0901 is added to Set B
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
Additional Notes
1 Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2 Another overlapped variant set with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, they are all invalid, as a Matra can only follow C or N while all sequences including 0902 as second element have V or M as first element.
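To make the when(followed-by-V-C-or-end) condition concrete, the sketch below applies it to the extended Variant Set C. It is only an illustration: the vowel and consonant ranges are assumptions standing in for the LGR categories, and the helper names `followed_by_v_c_or_end` and `set_c_variants_at` are hypothetical.

```python
def is_vowel(ch: str) -> bool:
    return "\u0905" <= ch <= "\u0914"   # assumed range for V


def is_consonant(ch: str) -> bool:
    return "\u0915" <= ch <= "\u0939"   # assumed range for C


def followed_by_v_c_or_end(label: str, end: int) -> bool:
    """True if position `end` is the end of the label or holds a V or C."""
    if end >= len(label):
        return True
    return is_vowel(label[end]) or is_consonant(label[end])


# Extended Variant Set C: 0906 0902 <--> 0906 093C 0902 <--> 0974
SET_C = ("\u0906\u0902", "\u0906\u093C\u0902", "\u0974")


def set_c_variants_at(label: str, start: int) -> list:
    """Return the other members of Set C if one member occurs at `start`
    and the context condition holds; otherwise return an empty list."""
    for member in SET_C:
        end = start + len(member)
        if label.startswith(member, start) and followed_by_v_c_or_end(label, end):
            return [other for other in SET_C if other != member]
    return []
```

In this sketch, 0974 followed by a consonant has both other members as variants, whereas 0974 followed by 0902 has none, since that context is not shared by the sequences ending in 0902.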
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
ऺ U+093A | ं U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
ॆ U+0946 | े U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)
Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases, the presence or absence of the final Halant is clearly visible and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which when succeeded by an Anusvara (U+0902) create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of the variants are as follows:
Variant 1 | Variant 2
ँ U+0901 | ॅं U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character like a Consonant or a Vowel before them. For the sake of clarity, both pairs are shown here preceded by a Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902) as given below.
Rule: As per Table 18 Proposed Variants - Set 3, the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
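The rule above can be sketched as a check on the characters preceding each Candrabindu. This is a minimal illustration that assumes C is approximated by U+0915 to U+0939; the function name `candrabindu_variant_positions` is hypothetical.

```python
CANDRABINDU = "\u0901"
NUKTA = "\u093C"


def is_consonant(ch: str) -> bool:
    # Assumed consonant range; an empty string (no character) is not a consonant.
    return ch != "" and "\u0915" <= ch <= "\u0939"


def candrabindu_variant_positions(label: str) -> list:
    """Return the indexes of every U+0901 that may map to U+0945 U+0902,
    i.e. every Candrabindu preceded by C or by C followed by N."""
    positions = []
    for i, ch in enumerate(label):
        if ch != CANDRABINDU:
            continue
        prev = label[i - 1] if i >= 1 else ""
        prev2 = label[i - 2] if i >= 2 else ""
        if is_consonant(prev) or (prev == NUKTA and is_consonant(prev2)):
            positions.append(i)
    return positions
```

A Candrabindu after a vowel or a matra yields no position in this sketch, since replacing it would place the Matra U+0945 in a context where a Matra is not allowed.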
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are the variants that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं U+0902 | ਂ U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
ऺ U+093A | ਂ U+0A02
़ U+093C | ਼ U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
ॅ U+0945 | ੱ U+0A71
ॆ U+0946 | ੇ U+0A47
ॆ U+0946 | ੋ U+0A4B
े U+0947 | ੇ U+0A47
े U+0947 | ੋ U+0A4B
ै U+0948 | ੈ U+0A48
ॖ U+0956 | ੁ U+0A41
ॗ U+0957 | ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 Devanagari Cross-script confusables in Appendix B Cross-script Confusables lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories as mentioned in Table 6 Code point repertoire:
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1 N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
The set V1 consists of these vowels:
a आ (U+0906) (Required in Santali language)
b ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a ा (U+093E) (Required in Santali language)
b ो (U+094B) (Required in Santali language)
2 H must be preceded by C or CN, where CN is a C followed by an N
3 M must be preceded by C or CN, where CN is a C followed by an N
4 X must be preceded by either of V, C, N or M
5 B must be preceded by either of V, C, N or M
6 D must be preceded by either of V, C, N or M
7 V can NOT be preceded by H (details in Case of V preceded by H)
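The seven rules above all constrain a character by what precedes it, so they can be sketched as a single left-to-right pass. The following is an illustration under assumptions rather than the normative LGR: category membership is approximated by code point ranges (V as U+0905 to U+0914, C as U+0915 to U+0939, M as U+093E to U+094C), the Eyelash Reph and the variant-related context rules are not modelled, and the names `category` and `wle_violations` are hypothetical.

```python
NUKTA, HALANT = "\u093C", "\u094D"
ANUSVARA, CANDRABINDU, VISARGA = "\u0902", "\u0901", "\u0903"
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
SIGN_RULES = {"X": 4, "B": 5, "D": 6}


def category(ch: str) -> str:
    """Map a code point to its WLE symbol (ranges are assumptions)."""
    fixed = {NUKTA: "N", HALANT: "H", ANUSVARA: "B",
             CANDRABINDU: "D", VISARGA: "X"}
    if ch in fixed:
        return fixed[ch]
    if "\u0905" <= ch <= "\u0914":
        return "V"
    if "\u0915" <= ch <= "\u0939":
        return "C"
    if "\u093E" <= ch <= "\u094C":
        return "M"
    return "?"


def wle_violations(label: str) -> list:
    """Return (index, rule number) for every WLE rule 1-7 that is broken."""
    violations = []
    for i, ch in enumerate(label):
        cat = category(ch)
        prev = label[i - 1] if i >= 1 else ""
        prev_cat = category(prev) if prev else ""
        prev2_cat = category(label[i - 2]) if i >= 2 else ""
        after_c_or_cn = prev_cat == "C" or (prev_cat == "N" and prev2_cat == "C")
        if cat == "N" and prev not in (C1 | V1 | M1):
            violations.append((i, 1))
        elif cat == "H" and not after_c_or_cn:
            violations.append((i, 2))
        elif cat == "M" and not after_c_or_cn:
            violations.append((i, 3))
        elif cat in SIGN_RULES and prev_cat not in ("V", "C", "N", "M"):
            violations.append((i, SIGN_RULES[cat]))
        elif cat == "V" and prev_cat == "H":
            violations.append((i, 7))
    return violations
```

An empty result means that none of rules 1 to 7 is violated under these assumptions; it does not by itself establish that a label is valid under the full LGR.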
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1 As the U+0931 is added as a part of the permissible sequences in Table 7 Sequences, it gets permitted only with the specific sequences.
2 The last characters of both the sequences of which the U+0931 is part are consonants. As the Eyelash Reph can take all the combinations as that of a consonant, no specific handling in terms of a context rule is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors

NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data

Following is the full list of NBGP members with their language expertise:

Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and .np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ. of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi

In addition, the following members externally gave inputs to NBGP for the respective languages / scripts:

Name | Language / Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References

[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11, http://www.unicode.org/versions/Unicode11.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted

Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.

10.1 WRITING SYSTEMS

1. Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition, in 2 Volumes. Hutchison, London, 1968.

10.2 DEVANĀGARĪ

1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar, 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.

10.3 INDIC COMPUTING SPECIFIC

1. IS 10401: 8-bit code for information interchange, 1982
2. IS 10315: 7-bit coded character set for information interchange, 1985
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001
5. ISO 2375: Procedure for registration of escape sequences, 2003
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters / sequences

Table 21 below shows characters / character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C

Table 21: Visually confusables
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.

At first they may not look exactly the same. However, in the given context (e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing) they can create visual confusion.

A label can be considered to have a cross-script variant label only if all the constituent characters / aksharas have an equivalent confusable in the other script. If there is even one single character / akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 | Bengali (see note 17)
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali

Table 22: Devanagari Cross-script confusables

Note 17: The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
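The "all constituents must have a confusable" condition described in this appendix can be sketched as follows. The mapping holds only a handful of single-character rows copied from Table 22, multi-character aksharas are ignored, and the names `CONFUSABLES` and `fully_confusable` are hypothetical; it is an illustration, not part of the LGR.

```python
# A few rows of Table 22, keyed by Devanagari character (illustrative subset).
CONFUSABLES = {
    "\u0920": {"Gurmukhi": ["\u0a28", "\u0a30"]},  # ठ
    "\u0921": {"Gurmukhi": ["\u0a21", "\u0a24"]},  # ड
    "\u0922": {"Gurmukhi": ["\u0a22"]},            # ढ
    "\u0924": {"Gurmukhi": ["\u0a1c"]},            # त
    "\u092f": {"Gurmukhi": ["\u0a27"]},            # य
    "\u0918": {"Bengali": ["\u0998"]},             # घ
}


def fully_confusable(label, script):
    """True only if every character of the label has a confusable in `script`."""
    return bool(label) and all(script in CONFUSABLES.get(ch, {}) for ch in label)
```

A single character without a counterpart in the other script is enough to make the whole label visually distinct, which is why the check uses `all`.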
5.5.2 Operators used

Symbol | Function
| | Alternative
[ ] | Optional
 | Variable Repetition
( ) | Sequence Group

Table 8: Symbol functions

In what follows, the Vowel Sequence and the Consonant Sequence pertinent to Devanagari when used to write Hindi are given.

5.5.3 The Vowel Sequence

A vowel sequence begins with a vowel. It may be optionally followed by an Anusvara (B), Candrabindu (D) or a Visarga (X). The number of B, D or X which can follow a V in Devanagari is restricted to one.

The possibility of a Visarga following a Candrabindu or Anusvara is ruled out, since it is used only in Vedic and in Bengali script.

The vowel sequence in Hindi is therefore: V[B|D|X]

Examples:

Sequence Description | Sequence | Example | Constituting characters
Vowel | V | अ a | U+0905
Vowel + Anusvara | V[B] | अं aṁ | अ ं U+0905 U+0902
Vowel + Candrabindu | V[D] | अँ aṃ | अ ँ U+0905 U+0901
Vowel + Visarga | V[X] | अः aḥ | अ ः U+0905 U+0903

Table 9
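As a rough sketch, the pattern V[B|D|X] can be expressed as a regular expression over category symbols. The vowel range U+0905 to U+0914 is an assumed approximation of the vowels in the repertoire, and the helper names are hypothetical.

```python
import re

SIGNS = {"\u0902": "B", "\u0901": "D", "\u0903": "X"}


def to_categories(text):
    """Map each character to V, B, D, X, or '?' for anything else."""
    out = []
    for ch in text:
        if ch in SIGNS:
            out.append(SIGNS[ch])
        elif "\u0905" <= ch <= "\u0914":  # assumed vowel range
            out.append("V")
        else:
            out.append("?")
    return "".join(out)


# V[B|D|X]: one vowel, optionally followed by exactly one of B, D or X.
VOWEL_SEQUENCE = re.compile(r"V[BDX]?")


def is_vowel_sequence(text):
    return VOWEL_SEQUENCE.fullmatch(to_categories(text)) is not None
```

With this sketch, U+0905 U+0902 is accepted, while U+0905 U+0902 U+0903 is rejected because only one of B, D or X may follow the vowel.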
5.5.4 Consonant Sequence

A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the Consonant sequence after the N, M and H. Each of these is discussed in the following sections.

1. A single consonant (C)

(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign, wherever such a case is pertinent.)

Examples:

Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क ka | U+0915 <single character>
Consonant + Nukta | C[N] | क़ ḳa | क ़ U+0915 U+093C

Table 10

2. A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]:

C[M|B|D|X|H]

Examples:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि ki | क ि U+0915 U+093F
Consonant + Anusvara | C[B] | कं kaṁ | क ं U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ kaṃ | क ँ U+0915 U+0901
Consonant + Visarga | C[X] | कः kaḥ | क ः U+0915 U+0903
Consonant + Halant | C[H] | क् k (Pure Consonant) | क ् U+0915 U+094D

Table 11
2A. A CM sequence can be optionally followed by D, B or X:

(CM)[D|B|X]

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं kīṁ | क ी ं U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ kāṃ | क ा ँ U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः kīḥ | क ी ः U+0915 U+0940 U+0903

Table 12

3. A sequence of consonants (up to 4) joined by Halant: 3(CH)C

(In the case of Sanskrit, it can join up to 5 consonants.)

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य nkrya | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F

Table 13

However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.

Subsets:

3A. The combination may be followed by M, B, D or X.

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की kkī | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं kkaṁ | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ kkaṃ | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः kkaḥ | U+0915 U+094D U+0915 U+0903

Table 14

3B. 3(CH)CM may be followed by a B, D or X.

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं kkīṁ | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ kkīṃ | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः kkīḥ | U+0915 U+094D U+0915 U+0940 U+0903

Table 15
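The consonant-sequence patterns of Section 5.5.4 can be folded into one regular expression over category symbols (C, N, H, M, B, D, X). This is a sketch of the Hindi akshar grammar as described here, with the limit of four consonants applied; it is not the WLE rule set, which imposes no such limit. The name `is_consonant_sequence` is hypothetical, and allowing a Nukta after every consonant of a conjunct is an assumption.

```python
import re

# Up to three "consonant (+ Nukta) + Halant" prefixes, then a final consonant
# (+ Nukta), then either a Halant, or an optional Matra followed by at most
# one of Anusvara, Candrabindu or Visarga.
CONSONANT_SEQUENCE = re.compile(r"(?:CN?H){0,3}CN?(?:H|M?[BDX]?)")


def is_consonant_sequence(categories):
    """Check a string of category symbols, e.g. 'CHCMB', against the grammar."""
    return CONSONANT_SEQUENCE.fullmatch(categories) is not None
```

The input is a string of category symbols rather than code points, so the sketch stays independent of how the repertoire assigns characters to categories.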
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
6 Variants

There are no characters / character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:

Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String similarity assessment panel) entrusted to deal with such cases. Table 21: Visually confusables in Appendix A: Visually confusable characters / sequences lists them.

Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.

6.1 Vowel / Vowel sign followed by Nukta

The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.

This gives rise to the possibility of creating certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed.
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C

Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants

All of the Nukta variants given in Table 16: Proposed Variants - Set 1 have a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2, e.g. in the first pair आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new case of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.

Rule: As per Table 16: Proposed Variants - Set 1, the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 set character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:

आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)

The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
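The effect of the "not followed by a Nukta" condition can be sketched with a small variant generator. It handles only the four pairs of Table 16, applies the same condition in both directions, and the name `nukta_variants` is hypothetical; a real LGR tool evaluates the XML rules instead.

```python
from itertools import product

NUKTA = "\u093c"
BASES = {"\u0906", "\u0913", "\u093e", "\u094b"}  # Variant 1 column of Table 16


def nukta_variants(label):
    """Return the label together with its Nukta variants from Table 16."""
    options = []
    i = 0
    while i < len(label):
        ch = label[i]
        nxt = label[i + 1] if i + 1 < len(label) else ""
        if ch in BASES and nxt == NUKTA:
            after = label[i + 2] if i + 2 < len(label) else ""
            # Variant 2 maps back to Variant 1 only if no further Nukta follows.
            options.append([ch + NUKTA] if after == NUKTA else [ch + NUKTA, ch])
            i += 2
        elif ch in BASES:
            # Here the base is not followed by a Nukta, so the mapping applies.
            options.append([ch, ch + NUKTA])
            i += 1
        else:
            options.append([ch])
            i += 1
    return {"".join(parts) for parts in product(*options)}
```

Because each base is expanded at most once, the substitution never regenerates a second Nukta, which is the situation the context rule is meant to exclude.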
6.1.2 Overlapped variant analysis involving Nukta

Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.

Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:

Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N

When substituting the variant from the overlapping sets for 0906 or 093E respectively into the left-hand-side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence / absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:

093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.

For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.

Likewise, for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).

The resulting variant sets and variant contextual rules are:

Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)

Additional Notes

1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another overlapped variant set with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, they are all invalid, as a Matra can only follow C or N while all sequences including 0902 as second element have V or M as first element.
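One way to picture the symmetry and transitivity requirement is to expand each extended set into every ordered pair of its members. The data below is copied from the conclusion table of this section; the dictionary layout and the name `mappings` are hypothetical, chosen only for illustration.

```python
from itertools import permutations

# Extended variant sets A-D, as space-separated code point sequences.
VARIANT_SETS = {
    "A": ["093E 0901", "093E 093C 0901", "0949 0902"],
    "B": ["0906 0901", "0906 093C 0901", "0911 0902"],
    "C": ["0906 0902", "0906 093C 0902", "0974"],
    "D": ["093B", "093E 0902", "093E 093C 0902"],
}
# Contextual rule attached to the mappings of a set, where one applies.
CONTEXT = {"C": "followed-by-V-C-or-end", "D": "followed-by-V-C-or-end"}


def mappings(set_name):
    """All (source, target, context) triples for one variant set."""
    rule = CONTEXT.get(set_name)
    return {(s, t, rule) for s, t in permutations(VARIANT_SETS[set_name], 2)}
```

Each three-member set yields six directed mappings, so every member maps to every other member, in both directions, under the same context condition.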
6.2 Unique Vowels and Vowel Signs required for Kashmiri

Kashmiri, when written in Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels / Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels / Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.

Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
ऺ U+093A | ं U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
ॆ U+0946 | े U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C

Table 17: Proposed Variants - Set 2

No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)

Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence / absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.

In these cases, the presence or absence of the final Halant is clearly visible and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara

This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which when succeeded by an Anusvara (U+0902) create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:

Variant 1 | Variant 2
ँ U+0901 | ॅं U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902

Table 18: Proposed Variants - Set 3

As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character like a Consonant or a Vowel before them. For the sake of clarity, both pairs are shown here preceded by a Devanagari Letter Ka (क - U+0915):

Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)

Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair

The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by either of Consonants, Vowels, Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.

Rule: As per Table 18: Proposed Variants - Set 3, the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.

There is no additional constraint on the preceding character of ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence ॅं (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
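The context condition for this pair can be sketched as a guarded substitution. The consonant test uses the range U+0915 to U+0939 as an assumed stand-in for category C, the function names are hypothetical, and only the first pair of Table 18 is handled.

```python
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093c"


def is_consonant(ch):
    # Assumed approximation of category C.
    return "\u0915" <= ch <= "\u0939"


def follows_c_or_cn(label, i):
    """True if position i is preceded by C, or by C followed by a Nukta."""
    if i >= 1 and is_consonant(label[i - 1]):
        return True
    return i >= 2 and label[i - 1] == NUKTA and is_consonant(label[i - 2])


def candra_variant(label):
    """Replace U+0901 by U+0945 U+0902 wherever the context rule allows it."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and follows_c_or_cn(label, i):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu after a vowel or a Matra is left untouched by this sketch, since U+0945 could not validly appear in that position.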
6.5 Variant Disposition

As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.

There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants

A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.

Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.

NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)

The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.

It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.

NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
U+0902 | U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
U+093A | U+0A02
U+093C | U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
U+0945 | U+0A71
U+0946 | U+0A47
U+0946 | U+0A4B
U+0947 | U+0A47
U+0947 | U+0A4B
U+0948 | U+0A48
U+0956 | U+0A41
U+0957 | U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is U+0931 (ऱ - DEVANAGARI LETTER RRA), H is U+094D (DEVANAGARI SIGN VIRAMA) and C3 is either U+092F (य - DEVANAGARI LETTER YA) or U+0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1 N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
The set V1 consists of these vowels:
a आ (U+0906) (Required in Santali language)
b ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a ा (U+093E) (Required in Santali language)
b ो (U+094B) (Required in Santali language)
2 H must be preceded by C or CN [footnote 15]
3 M must be preceded by C or CN [footnote 16]
4 X must be preceded by either of V, C, N or M
5 B must be preceded by either of V, C, N or M
6 D must be preceded by either of V, C, N or M
7 V can NOT be preceded by H (details in Case of V preceded by H)
Footnotes 15 and 16: where CN is a C followed by an N.
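The seven rules above can be sketched as a single left-to-right pass over a label. The following Python is only an illustration, not the normative XML of the LGR: the sets C, V and M are small assumed subsets of the Table 6 categories, and the function name first_violation is hypothetical.

```python
def chars(*cps):
    """Build a set of characters from code points."""
    return {chr(cp) for cp in cps}

# Assumed, illustrative subsets only; the real categories come from Table 6.
C = chars(0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C,
          0x0921, 0x0922, 0x0924, 0x092A, 0x092B, 0x092E, 0x0930)
V = chars(0x0905, 0x0906, 0x0907, 0x0909, 0x090F, 0x0913)
M = chars(0x093E, 0x093F, 0x0940, 0x0947, 0x094B)
B, D, X, H, N = "\u0902", "\u0901", "\u0903", "\u094D", "\u093C"

# The sets named in rule 1, as listed in the text.
C1 = chars(0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C,
           0x0921, 0x0922, 0x092B)
V1 = chars(0x0906, 0x0913)
M1 = chars(0x093E, 0x094B)


def first_violation(label):
    """Return the number (1-7) of the first WLE rule violated, or None."""
    for i, ch in enumerate(label):
        prev = label[i - 1] if i >= 1 else ""
        prev2 = label[i - 2] if i >= 2 else ""
        # "C or CN": a consonant, or a consonant followed by a Nukta.
        after_c_or_cn = prev in C or (prev == N and prev2 in C)
        if ch == N and prev not in (C1 | V1 | M1):
            return 1
        if ch == H and not after_c_or_cn:
            return 2
        if ch in M and not after_c_or_cn:
            return 3
        if ch in (X, B, D) and not (prev in V or prev in C
                                    or prev == N or prev in M):
            return {X: 4, B: 5, D: 6}[ch]
        if ch in V and prev == H:
            return 7
    return None
```

For instance, the multi-word label discussed under "Case of V preceded by H" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930) is rejected by rule 7 in this sketch, because the vowel U+0905 directly follows the Halant.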
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• A variant is not defined if followed by a Nukta (see 6.4.1)
• A variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1 As U+0931 is added as a part of permissible sequences in Table 7 (Sequences), it gets permitted only with the specific sequences.
2 The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations as that of a consonant, no specific handling in terms of a context rule is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, eg आम्अचार aːməchaːr "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of the hyphen, the space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | Basanta Kumar Panda | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | Debajit Sharma | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies K and A Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11, http://www.unicode.org/versions/Unicode11.0/ (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1 Diringer, D. The Alphabet: A Key to the History of Mankind. 3rd Edition, in 2 Volumes. Hutchinson, London, 1968.
10.2 DEVANĀGARĪ
1 Agrawala, V S. (1966) The Devanāgarī script. In Indian Systems of Writing (Pp 12-16). Delhi: Publications Division.
2 Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3 Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4 Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E J Brill.
5 Bright, W. (1996) The Devanāgarī script. In P Daniels and W Bright (eds), The World's Writing Systems (Pp 384-390). New York: Oxford University Press.
6 Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed). London: Croom Helm. 448-469.
7 Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8 Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9 Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10 Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11 Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12 Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13 McGregor, R S. (1977) Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14 McGregor, R S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15 McGregor, R S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16 McGregor, R S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17 Pandey, P K. (2007) Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy 10(2), 139-156.
18 Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19 Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20 Singh, A K. (2007) Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P G Patel, P Pandey and D Rajgor (eds), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp 85-107). New Delhi: D K Printworld.
21 Sproat, R. (2000) A Computational Theory of Writing Systems. Cambridge University Press.
22 Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23 Verma, M K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1 IS 10401: 8-bit code for information interchange, 1982
2 IS 10315: 7-bit coded character set for information interchange, 1985
3 IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987
4 ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001
5 ISO 2375: Procedure for registration of escape sequences, 2003
6 ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001
7 IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters and character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in the given context (eg in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing) they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
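The all-or-nothing condition in the last paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the mapping is a small excerpt of single-code-point pairs taken from Table 19, the multi-code-point akshara rows and the code points with two Gurmukhi targets are ignored, and the name cross_script_variant is hypothetical.

```python
# Assumed excerpt of Table 19 (Devanagari -> Gurmukhi), single code points only.
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # ग -> ਗ
    "\u091F": "\u0A1F",  # ट -> ਟ
    "\u0920": "\u0A20",  # ठ -> ਠ
    "\u092E": "\u0A38",  # म -> ਸ
    "\u093F": "\u0A3F",  # ि -> ਿ
    "\u0940": "\u0A40",  # ी -> ੀ
}


def cross_script_variant(label, mapping=DEVA_TO_GURU):
    """Return the other-script look-alike label, or None if any code point
    has no counterpart (one distinct character makes the label distinct)."""
    out = []
    for ch in label:
        if ch not in mapping:
            return None
        out.append(mapping[ch])
    return "".join(out)
```

With this excerpt, the label U+0917 U+091F maps to U+0A17 U+0A1F, whereas U+0917 U+0915 yields None because U+0915 has no listed counterpart.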
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 [footnote 17] | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
U+0945 | U+0981 | Bengali
Table 22 Devanagari Cross-script confusables
Footnote 17: The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
5.5.4 Consonant Sequence
A consonant sequence begins with a consonant. It may be optionally followed by a Nukta (N), Matra (M), Anusvara (B), Candrabindu (D), Visarga (X) or a Halant (H). The number of instances of these characters occurring after a consonant is restricted to one. There is a possibility of further extension of the consonant sequence after the N, M and H. Each of these has been discussed in the following sections.
1 A single consonant (C)
(The consonant shall be treated as coterminous with the Consonant along with the Nukta sign wherever such a case is pertinent.)
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant | C | क ka | U+0915 <single character>
Consonant + Nukta | C[N] | क़ ḳa | U+0915 U+093C
Table 10
2 A consonant optionally followed by a dependent vowel sign Matra [M], or Anusvara [B], Candrabindu [D], or Visarga [X], or Halant [H]
C[M|B|D|X|H]
Examples:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra | C[M] | कि ki | U+0915 U+093F
Consonant + Anusvara | C[B] | कं kaṁ | U+0915 U+0902
Consonant + Candrabindu | C[D] | कँ kaṃ | U+0915 U+0901
Consonant + Visarga | C[X] | कः kaḥ | U+0915 U+0903
Consonant + Halant | C[H] | क् k (Pure Consonant) | U+0915 U+094D
Table 11
2A A CM sequence can be optionally followed by D, B or X
(CM)[D|B|X]
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं kīṁ | U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ kāṃ | U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः kīḥ | U+0915 U+0940 U+0903
Table 12
3 A sequence of consonants (up to 4) joined by Halant [footnote 14]: 3(CH)C
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य nkrya | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13
However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.
Footnote 14: In the case of Sanskrit, it can join up to 5 consonants.
Subsets
3A The combination may be followed by M, B, D or X
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की kkī | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं kkaṁ | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ kkaṃ | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः kkaḥ | U+0915 U+094D U+0915 U+0903
Table 14
3B 3(CH)CM may be followed by a B, D or X
Example:
Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं kkīṁ | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ kkīṃ | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः kkīḥ | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
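The consonant-sequence patterns of this section (items 1 to 3B) can be summarised as one regular expression over category symbols. The sketch below is an assumption-laden illustration, not part of the LGR: it works on strings of the symbols C, N, M, B, D, X and H rather than on code points, it allows an optional Nukta after each consonant of a conjunct, and it encodes the "up to 4 consonants" limit described here, which the WLE rules of Section 7 do not impose. The names AKSHAR and is_consonant_sequence are hypothetical.

```python
import re

# Up to three "consonant (+ Nukta) + Halant" links, then a final consonant
# (+ Nukta), then an optional tail: a Halant, a Matra with an optional
# B/D/X, or a bare B/D/X.
AKSHAR = re.compile(r"(?:CN?H){0,3}CN?(?:H|M[BDX]?|[BDX])?")


def is_consonant_sequence(categories):
    """True if the category string is one consonant sequence as described."""
    return AKSHAR.fullmatch(categories) is not None
```

For example, "CHCHCHC" (Table 13) and "CHCMX" (Table 15) match, while "CHCHCHCHC" (five consonants) and "CBM" (a Matra after an Anusvara) do not.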
6 Variants
There are no characters or character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String similarity assessment panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar to a majority of the Devanagari user base. Being a unique case of homographic similarity, the following variants are being proposed:
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) have a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
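The effect of the "not followed by a Nukta" condition can be illustrated with a short Python sketch. It is an assumption-labelled illustration only: it computes the single variant label in which every eligible Variant 1 code point receives a Nukta (an LGR tool would enumerate all permutations), and the name nukta_variant is hypothetical.

```python
NUKTA = "\u093C"
# Variant 1 code points from Table 16: आ, ओ, ा, ो
NUKTA_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variant(label):
    """Add a Nukta after each Table 16 code point that is not already
    followed by one; the context check prevents Nukta + Nukta."""
    out = []
    for i, ch in enumerate(label):
        out.append(ch)
        following = label[i + 1] if i + 1 < len(label) else ""
        if ch in NUKTA_BASES and following != NUKTA:
            out.append(NUKTA)
    return "".join(out)
```

Applying the function twice gives the same result as applying it once, which is the regenerative behaviour the context rule is meant to stop.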
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E respectively into the sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence or absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when (followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise, for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when (followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when (followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when (followed-by-V-C-or-end)
Additional Notes
1 Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when (follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2 Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
U+093A | U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
U+0946 | U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)
Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of a Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence or absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz (U+0945) and ॉ (U+0949), which when succeeded by an Anusvara (U+0902) create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of the variants are as follows:
Variant 1 | Variant 2
U+0901 | U+0945 U+0902
ा U+093E U+0901 | ॉ U+0949 U+0902
अ U+0905 U+0901 | U+0972 U+0902
ए U+090F U+0901 | ऍ U+090D U+0902
आ U+0906 U+0901 | ऑ U+0911 U+0902
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character like a Consonant or a Vowel before them. For the sake of clarity, both the pairs preceded by a Devanagari Letter Ka (क - U+0915) are shown here:
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, both U+0945 U+0902 and U+0949 U+0902 should be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow the same convention and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, ie (U+0945). While the Candrabindu (U+0901) can be preceded by either of Consonants, Vowels, Nukta or a Matra, the same is not true for (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902) as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
6.5 Variant Disposition

As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.

There is no preference among these variants. Whichever label containing either of these variants is chosen first, the other equivalent variant label should be blocked.
6.6 Cross-script Variants

A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.

Every individual LGR under NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under NBGP.

NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)

The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. Table 19 lists the cases proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.

It is to be noted that none of the combinations listed in Table 19 and Table 20 are considered equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.

NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
U+0902 | U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
U+093A | U+0A02
U+093C | U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
U+0945 | U+0A71
U+0946 | U+0A47
U+0946 | U+0A4B
U+0947 | U+0A47
U+0947 | U+0A4B
U+0948 | U+0A48
U+0956 | U+0A41
U+0957 | U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants

In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)

This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.

Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):

C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is U+0931 (ऱ - DEVANAGARI LETTER RRA), H is U+094D (DEVANAGARI SIGN VIRAMA), and C3 is either U+092F (य - DEVANAGARI LETTER YA) or U+0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:

1. N must be preceded only by a member of C1, V1 or M1.

   The set C1 consists of these consonants:
   a. क (U+0915)
   b. ख (U+0916)
   c. ग (U+0917)
   d. च (U+091A)
   e. छ (U+091B)
   f. ज (U+091C)
   g. ड (U+0921)
   h. ढ (U+0922)
   i. फ (U+092B)

   The set V1 consists of these vowels:
   a. आ (U+0906) (Required in Santali language)
   b. ओ (U+0913) (Required in Santali language)

   The set M1 consists of these matras:
   a. ा (U+093E) (Required in Santali language)
   b. ो (U+094B) (Required in Santali language)

2. H must be preceded by C or CN (where CN is a C followed by an N).
3. M must be preceded by C or CN (where CN is a C followed by an N).
4. X must be preceded by either of V, C, N or M.
5. B must be preceded by either of V, C, N or M.
6. D must be preceded by either of V, C, N or M.
7. V can NOT be preceded by H (details in "Case of V preceded by H").

Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:

• Variant is not defined if followed by a Nukta (see 6.4.1).
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2).
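The seven rules above lend themselves to a left-to-right scan that looks only at the preceding one or two code points. The following is a minimal sketch of such a scan, not the NBGP's implementation. The `category` function and its code point ranges are assumptions made for illustration (the normative categories are those of Table 6), and `wle_violations` is a hypothetical name.

```python
# Illustrative sketch only. The category ranges are assumptions for this
# example and are NOT the normative repertoire of the LGR (see Table 6).
C1 = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C, 0x0921, 0x0922, 0x092B}
V1 = {0x0906, 0x0913}
M1 = {0x093E, 0x094B}
SIGNS = {0x0901: "D", 0x0902: "B", 0x0903: "X", 0x093C: "N", 0x094D: "H"}


def category(cp):
    """Map a code point to one of the WLE symbols C, M, V, B, D, X, H, N."""
    if cp in SIGNS:
        return SIGNS[cp]
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x0904 <= cp <= 0x0914 or 0x0972 <= cp <= 0x0977:
        return "V"
    if cp in (0x093A, 0x093B, 0x094F, 0x0956, 0x0957) or 0x093E <= cp <= 0x094C:
        return "M"
    raise ValueError(f"U+{cp:04X} is outside this sketch")


def wle_violations(label):
    """Return the numbers of the WLE rules (1-7) that the label breaks."""
    cps = [ord(ch) for ch in label]
    cats = [category(cp) for cp in cps]
    broken = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        prev_cp = cps[i - 1] if i > 0 else None
        # "C or CN": a consonant, or a Nukta that itself follows a consonant.
        after_c_or_cn = prev == "C" or (
            prev == "N" and i >= 2 and cats[i - 2] == "C"
        )
        if cat == "N" and prev_cp not in C1 | V1 | M1:
            broken.append(1)
        elif cat == "H" and not after_c_or_cn:
            broken.append(2)
        elif cat == "M" and not after_c_or_cn:
            broken.append(3)
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            broken.append({"X": 4, "B": 5, "D": 6}[cat])
        elif cat == "V" and prev == "H":
            broken.append(7)
    return broken
```

Under these assumptions, the label U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 is reported as breaking rule 7, while the same label without the Halant passes.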
Case of Eyelash Reph

In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:

1. As U+0931 is added as a part of permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of a context rule is required.

Case of V preceded by H

As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.

There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).

This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic community, hence this is explicitly prohibited by the NBGP.

If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors

NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data.

Following is the full list of NBGP members with their language expertise:

Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DN Domains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi

In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:

Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References

[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)

[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)

[NBGP] Neo-Brahmi Generation Panel

[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)

[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)

[gTLD] generic Top Level Domain

[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)

[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)

[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)

[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0/ (Accessed on 12th Dec 2017)

[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)

[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)

[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)

[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)

[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)

[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)

[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)

[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)

[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)

[106] Unicode 10.0.0, "South and Central Asia-I: Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)

[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)

[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)

[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)

[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)

[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)

[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)

[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)

[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
11 Appendix A: Visually confusable characters / sequences

Table 21 below shows characters / character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.

At first they may not look exactly the same. However, in the given context (e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing) they can create visual confusion.

A label can be considered to have a cross-script variant label only if all the constituent characters / aksharas have an equivalent confusable in the other script. If there is even a single character / akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
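The all-or-nothing condition described above can be sketched as a lookup that fails as soon as one character has no counterpart. This is an illustrative sketch only: `DEVA_TO_GURU` holds a handful of single code point pairs taken from Table 19, the function name is hypothetical, and the multi-code-point rows of that table are deliberately not handled.

```python
# A few single code point Devanagari -> Gurmukhi pairs from Table 19.
# Illustrative subset only; sequence-based rows are not covered here.
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # ग -> ਗ
    "\u091F": "\u0A1F",  # ट -> ਟ
    "\u0920": "\u0A20",  # ठ -> ਠ
    "\u092E": "\u0A38",  # म -> ਸ
    "\u0935": "\u0A15",  # व -> ਕ
    "\u0939": "\u0A35",  # ह -> ਵ
}


def gurmukhi_variant_label(label):
    """Return the Gurmukhi look-alike label, or None if any character
    of the Devanagari label has no counterpart in the table."""
    out = []
    for ch in label:
        if ch not in DEVA_TO_GURU:
            return None  # one distinguishing character is enough
        out.append(DEVA_TO_GURU[ch])
    return "".join(out)
```

With this subset, a label made only of ग and ट maps to a Gurmukhi label, while adding क (U+0915), which has no entry, yields no cross-script variant.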
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 (see note 17) | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
U+0945 | U+0981 | Bengali
Table 22: Devanagari Cross-script confusables

17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
2A. A CM sequence can be optionally followed by D, B or X:

(CM)[D|B|X]

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Matra + Anusvara | CM[B] | कीं kīṁ | U+0915 U+0940 U+0902
Consonant + Matra + Candrabindu | CM[D] | काँ kāṃ | U+0915 U+093E U+0901
Consonant + Matra + Visarga | CM[X] | कीः kīḥ | U+0915 U+0940 U+0903
Table 12

3. A sequence of consonants (up to 4) joined by Halant (see note 14):

3(CH)C

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Halant + Consonant + Halant + Consonant | CHCHCHC | न्क्र्य nkrya | U+0928 U+094D U+0915 U+094D U+0930 U+094D U+092F
Table 13

However, the WLE rules proposed in Section 7 do not impose any restriction on the number of consonants that can be joined by a Halant.

14 In the case of Sanskrit, it can join up to 5 consonants.

Subsets

3A. The combination may be followed by M, B, D or X.

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra | CHC[M] | क्की kkī | U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara | CHC[B] | क्कं kkaṁ | U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu | CHC[D] | क्कँ kkaṃ | U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga | CHC[X] | क्कः kkaḥ | U+0915 U+094D U+0915 U+0903
Table 14

3B. 3(CH)CM may be followed by a B, D or X.

Example:

Sequence Description | Sequence | Example | Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara | CHCM[B] | क्कीं kkīṁ | U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu | CHCM[D] | क्कीँ kkīṃ | U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga | CHCM[X] | क्कीः kkīḥ | U+0915 U+094D U+0915 U+0940 U+0903
Table 15
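The akshar patterns 2A, 3, 3A and 3B can be read as regular expressions over a string of category symbols (one letter per code point). The sketch below assumes the label has already been converted to such a string, reads "3(CH)" as one to three repetitions of CH, and uses hypothetical names (`PATTERNS`, `matching_rules`). It covers only the patterns shown in this part of the section.

```python
import re

# Each pattern is matched against a string of category symbols such as
# "CHCMB". "3(CH)" is interpreted here as one to three repetitions of CH.
PATTERNS = {
    "2A": re.compile(r"CM[DBX]?"),
    "3": re.compile(r"(?:CH){1,3}C"),
    "3A": re.compile(r"(?:CH){1,3}C[MBDX]?"),
    "3B": re.compile(r"(?:CH){1,3}CM[BDX]?"),
}


def matching_rules(cats):
    """Return the names of the patterns that match the whole string."""
    return [name for name, pat in PATTERNS.items() if pat.fullmatch(cats)]
```

For instance, the category string for कीं is "CMB", which matches only 2A, and the string for क्कीः is "CHCMX", which matches only 3B.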
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR] on which the LGRs are supposed to be based excludes those characters.
6 Variants

There are no characters / character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:

Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community

As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (String similarity assessment panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters / sequences) lists them.

Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formations. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel / Vowel sign followed by Nukta

The Santali language has a unique requirement for the positioning of the Nukta character (U+093C), which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated the Whole Label Evaluation rules (given in Section 7) to be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.

This gives rise to the possibility of creating certain labels that can be deceptively similar to a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed:

Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | ा़ U+093E U+093C
ो U+094B | ो़ U+094B U+093C
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants

All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. For example, in the first pair आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.

Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:

आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)

The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to the context rules on the Nukta, the condition is only required for the first reason in this case.
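The effect of the "not followed by Nukta" condition can be illustrated with a short sketch. It is not taken from the LGR XML: `forward_sites` and `apply_forward` are hypothetical helper names, and only the four Variant 1 characters of Table 16 are considered.

```python
NUKTA = "\u093C"
# Variant 1 characters of Table 16: आ, ओ, ा, ो
SET1 = {"\u0906", "\u0913", "\u093E", "\u094B"}


def forward_sites(label):
    """Positions where the Variant 1 -> Variant 2 mapping is defined,
    i.e. a Set 1 character that is NOT already followed by a Nukta."""
    return [
        i for i, ch in enumerate(label)
        if ch in SET1 and label[i + 1:i + 2] != NUKTA
    ]


def apply_forward(label, i):
    """Replace the character at position i by itself plus a Nukta."""
    return label[:i + 1] + NUKTA + label[i + 1:]
```

Applying the mapping once at a given position removes that position from the list of eligible sites, so the substitution cannot be repeated to produce a double Nukta.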
6.1.2 Overlapped variant analysis involving Nukta

Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.

Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:

Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | blocked | not followed-by-N

When substituting the variant from the overlapping sets for 0906 or 093E respectively into the left-hand side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence / absence of the Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:

093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.

For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection of these contexts.

Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end).

The resulting variant sets and variant contextual rules are:

Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)

Additional Notes

1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
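Symmetry and transitivity within a set mean that every member maps to every other member. As a small sketch (the tuple representation and the name `all_mappings` are assumptions for illustration, not the RFC 7940 XML form), the extended Set A can be expanded into its pairwise mappings as follows.

```python
from itertools import permutations

# Extended Variant Set A from the table above, as tuples of code points.
SET_A = [
    (0x093E, 0x0901),
    (0x093E, 0x093C, 0x0901),
    (0x0949, 0x0902),
]


def all_mappings(variant_set):
    """Every ordered (source, target) pair of distinct members, so the
    result is symmetric and transitive by construction."""
    return list(permutations(variant_set, 2))


mappings = all_mappings(SET_A)
```

A set with three members therefore yields six directed mappings, and each mapping has its reverse in the list.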
6.2 Unique Vowels and Vowel Signs required for Kashmiri

Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels / Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels / Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.

Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
U+093A | U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
U+0946 | U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17: Proposed Variants - Set 2

No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)

Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence / absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.

In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
64 VariantsbasedonCandrabinduandCandraVowelSignsfollowedbyAnusvara
This isacaseofpairsofsimilar lookingvariants involvingaCandrabindu(U+0901) InDevanagari there are twoCandravowel signs viz (U+0945) and ॉ (U+0949)whichwhen succeeded by an Anusvara (U+0902) create a shape which resembles aCandrabindu(U+0901)ThisgivesrisetopairswhichgetrenderedexactlyalikeinmanyfontsThoughsome fontscanrender themdifferently thebehavior isnotconsistentandkeepsroomforambiguityTheactualpairsofthevariantsareasfollows
Variant1 Variant2
U+0901
U+0945U+0902
ा
U+093EU+0901
ॉ
U+0949U+0902
अ
U+0905U+0901
U+0972U+0902
ए
U+090FU+0901
ऍ
U+090DU+0902
आ
U+0906U+0901
ऑ
U+0911U+0902 Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915).
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, the sequences (U+0945 U+0902) and (U+0949 U+0902) should be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. (U+0945). While the Candrabindu (U+0901) can be preceded by any of Consonants, Vowels, Nukta or a Matra, the same is not true for (U+0945), which is a Matra. A Matra can only be preceded by a Consonant or a Consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902) as given below.
Rule: As per Table 18 Proposed Variants - Set 3, the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
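The context rule of Section 6.4.1 can be illustrated with a minimal Python sketch. It is not the normative XML rule: the function names are hypothetical, and the consonant set is assumed here to be the contiguous block U+0915..U+0939, whereas the actual Consonant category is defined by the repertoire.

```python
from typing import Optional

# Assumption: consonants are taken as the block U+0915..U+0939;
# a real LGR would read the Consonant category from the repertoire.
CONSONANTS = {chr(cp) for cp in range(0x0915, 0x093A)}
NUKTA = "\u093C"
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"


def preceded_by_c_or_cn(label: str, i: int) -> bool:
    """True if the character at index i follows a Consonant or Consonant + Nukta."""
    if i >= 1 and label[i - 1] in CONSONANTS:
        return True
    return i >= 2 and label[i - 1] == NUKTA and label[i - 2] in CONSONANTS


def candrabindu_variant(label: str) -> Optional[str]:
    """Replace each eligible Candrabindu with U+0945 U+0902.

    Returns None when no Candrabindu satisfies the context rule.
    """
    pieces = []
    changed = False
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and preceded_by_c_or_cn(label, i):
            pieces.append(CANDRA_E_ANUSVARA)
            changed = True
        else:
            pieces.append(ch)
    return "".join(pieces) if changed else None
```

Under these assumptions a Candrabindu after Ka yields a variant, while a Candrabindu after an independent vowel does not, because a Matra could not legally stand in that position.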
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
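A small Python sketch of the "blocked" disposition follows. The `Registry` class and the `CANONICAL` table are hypothetical illustrations, not part of the proposal; only a few single code point pairs from Table 17 are used, and real registry behavior is defined by the LGR tooling and registry policy.

```python
# Hypothetical folding table built from four single code point pairs of Table 17.
CANONICAL = str.maketrans({
    "\u090E": "\u0910",
    "\u0946": "\u0947",
    "\u0975": "\u0914",
    "\u094F": "\u094C",
})


class Registry:
    """Toy registry: the first label of a variant set is allocated, later ones are blocked."""

    def __init__(self) -> None:
        self._taken = {}  # folded form -> label that was allocated first

    def apply(self, label: str) -> str:
        key = label.translate(CANONICAL)
        if key in self._taken:
            return "blocked"
        self._taken[key] = label
        return "allocated"
```

Because both members of a pair fold to the same key, the outcome depends only on which label is applied for first, which reflects the absence of a preferred variant. In this sketch a repeated application for the same label is also reported as blocked.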
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari                                   Gurmukhi
(U+0902)                                     (U+0A02)
इ (U+0907)                                   ਙ (U+0A19)
उ (U+0909)                                   ਤ (U+0A24)
ग (U+0917)                                   ਗ (U+0A17)
घ (U+0918)                                   ਬ (U+0A2C)
ट (U+091F)                                   ਟ (U+0A1F)
ठ (U+0920)                                   ਠ (U+0A20)
ढ (U+0922)                                   ਫ (U+0A2B)
प (U+092A)                                   ਧ (U+0A27)
भ (U+092D)                                   ਮ (U+0A2E)
म (U+092E)                                   ਸ (U+0A38)
व (U+0935)                                   ਕ (U+0A15)
ह (U+0939)                                   ਵ (U+0A35)
(U+093A)                                     (U+0A02)
(U+093C)                                     (U+0A3C)
ि (U+093F)                                   ਿ (U+0A3F)
ी (U+0940)                                   ੀ (U+0A40)
(U+0945)                                     (U+0A71)
(U+0946)                                     (U+0A47)
(U+0946)                                     (U+0A4B)
(U+0947)                                     (U+0A47)
(U+0947)                                     (U+0A4B)
(U+0948)                                     (U+0A48)
(U+0956)                                     (U+0A41)
(U+0957)                                     (U+0A42)
प्टि (U+092A U+094D U+091F U+093F)            ਇ (U+0A07)
प्टी (U+092A U+094D U+091F U+0940)            ਈ (U+0A08)
प्टे (U+092A U+094D U+091F U+0947)            ਏ (U+0A0F)
प्टॆ (U+092A U+094D U+091F U+0946)            ਏ (U+0A0F)
त्त (U+0924 U+094D U+0924)                    ਜ (U+0A1C)
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari                                   Bengali
म (U+092E)                                   ম (U+09AE)
ि (U+093F)                                   ি (U+09BF)
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 Devanagari Cross-script confusables in Appendix B Cross-script Confusables lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 Code point repertoire:
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (where CN is a C followed by an N)
3. M must be preceded by C or CN (where CN is a C followed by an N)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in Case of V preceded by H)
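The seven rules above can be sketched as a small Python checker. This is an illustration only: the category sets are simplified assumptions (contiguous code point ranges rather than the exact repertoire), the names `wle_violations` and `ends_in_c_or_cn` are hypothetical, and sequences such as the Eyelash Reph are not treated specially.

```python
from typing import List

# Assumed, simplified category sets; the normative lists are in the repertoire table.
C = {chr(cp) for cp in range(0x0915, 0x093A)}                    # consonants
V = ({chr(cp) for cp in range(0x0905, 0x0915)}
     | {chr(cp) for cp in range(0x0972, 0x0976)})                # vowels
M = ({chr(cp) for cp in range(0x093E, 0x094D)}
     | {"\u093A", "\u093B", "\u094F", "\u0956", "\u0957"})       # matras
D, B, X, N, H = "\u0901", "\u0902", "\u0903", "\u093C", "\u094D"
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
NUKTA_BASES = C1 | V1 | M1
BEFORE_SIGN = V | C | M | {N}


def ends_in_c_or_cn(prefix: str) -> bool:
    """True if prefix ends with C, or with C followed by N."""
    if prefix.endswith(N):
        prefix = prefix[:-1]
    return prefix[-1:] in C


def wle_violations(label: str) -> List[str]:
    """Return one message per violated rule; an empty list means the label passes."""
    problems = []
    for i, ch in enumerate(label):
        before, prev = label[:i], label[i - 1:i]
        if ch == N and prev not in NUKTA_BASES:
            problems.append(f"rule 1 at index {i}")
        elif ch == H and not ends_in_c_or_cn(before):
            problems.append(f"rule 2 at index {i}")
        elif ch in M and not ends_in_c_or_cn(before):
            problems.append(f"rule 3 at index {i}")
        elif ch in (X, B, D) and prev not in BEFORE_SIGN:
            problems.append(f"rule 4-6 at index {i}")
        elif ch in V and prev == H:
            problems.append(f"rule 7 at index {i}")
    return problems
```

With these assumed sets, the sequence U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 is reported under rule 7, since the vowel U+0905 directly follows a Halant.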
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 Sequences, it gets permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार aːməchaːr "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS]ExpandedGradedIntergenerationalDisruptionScalehttpswwwethnologuecomaboutlanguage-status(Accessedon13thNov2017)
[NBGP]Neo-BrahmiGenerationPanel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD]genericTopLevelDomain
[ISCII]IndianScriptCodeforInformationInterchangehttpscdacinindexaspxid=mlc_gist_iscii(Accessedon2ndFeb2018)
[GIST]GraphicsIntelligencebasedScriptTechnologieshttpscdacinindexaspxid=gist(Accessedon2ndFeb2018)
[C-DAC]CentreforDevelopmentofAdvancedComputinghttpscdacin(Accessedon2ndFeb2018)
[0]TheUnicodeStandard11httpwwwunicodeorgversionsUnicode110(Accessedon12thDec2017)
[8]TheUnicodeStandard50httpwwwunicodeorgversionsUnicode500(Accessedon12thDec2017)
[9]TheUnicodeStandard51httpwwwunicodeorgversionsUnicode510(Accessedon12thDec2017)
[11]TheUnicodeStandard60httpwwwunicodeorgversionsUnicode600(Accessedon12thDec2017)
[100]DevanāgarīVIPTeamldquoVariantIssuesReportrdquoICANN3rdOct2011httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf(Accessedon10thOct2017)
[101]OmniglotHindihttpswwwomniglotcomwritinghindihtm(Accessedon10thOct2017)
[102]OmniglotMarathihttpswwwomniglotcomwritingmarathihtm(Accessedon10thOct2017)
[103]OmniglotSanskrithttpswwwomniglotcomwritingsanskrithtm(Accessedon10thOct2017)
[104]OmniglotSindhihttpswwwomniglotcomwritingsindhihtm(Accessedon10thOct2017)
[105]OmniglotKashmirihttpswwwomniglotcomwritingkashmirihtm(Accessedon10thOct2017)
[106]Unicode1000SouthandCentralAsia-I-OfficialScriptsofIndiardquoPage456(R5andR5a)httpwwwunicodeorgversionsUnicode1000ch12pdf(Accessedon13thNov2017)
[107]UnicodeIndicGroupDevanagariEyelashRahttpunicodeorg~emulleriwgp8utcdochtml(Accessedon13thNov2017)
[108]MKRainaHowtoreadandwriteKashmiriinDevanagarihttpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf(Accessedon12thDec2017)
[109]CentralHindiDirectorate-MinistryofHRD-GovtofIndiaDevanāgarīAlphabetanditsRomanizationhttphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml(Accessedon12thDec2017
[110]OmniglotBodohttpswwwomniglotcomwritingbodohtm(Accessedon12thDec2017)
[111]OmniglotMaithilihttpswwwomniglotcomwritingmaithilihtm(Accessedon12thDec2017)
[112]OmniglotKonkanihttpswwwomniglotcomwritingkonkanihtm(Accessedon20thMay2018)
[113]OmniglotNepalihttpswwwomniglotcomwritingnepalihtm(Accessedon20thMay2018)
[114] NBGP Public comment feedback for Devanagari Gujarati Gurmukhi Script LGRProposalshttpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw(Accessedon18thFeb2019)
10 Booksarticlesandwebographiesconsulted
Followingisathematicallysortedsetofdocumentsbooksarticlesandwebographies
consultedinthedraftingofthisreport
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
11 KachruYamuna1966AnIntroductiontoHindiSyntaxUrbanaUniversityof
IllinoisDepartmentofLinguistics
12 KalyanKaleandAnjaliSoman1986LearningMarathiShriVishakhaPrakashan
Pune
13 McGregorRS(1977)OutlineofHindiGrammar2ndedDelhiOxfordUniversity
Press
14 McGregorRS1972OutlineofHindiGrammarwithExercisesDelhiOxford
UniversityPress
15 McGregorRS1974HindiLiteratureoftheNineteenthandEarlyTwentieth
CenturiesWiesbadenHarrassowitz
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters and character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1                 Confusable 2
क (U+0915)                   क़ (U+0915 U+093C)
ख (U+0916)                   ख़ (U+0916 U+093C)
ग (U+0917)                   ग़ (U+0917 U+093C)
च (U+091A)                   च़ (U+091A U+093C)
छ (U+091B)                   छ़ (U+091B U+093C)
ज (U+091C)                   ज़ (U+091C U+093C)
ड (U+0921)                   ड़ (U+0921 U+093C)
ढ (U+0922)                   ढ़ (U+0922 U+093C)
फ (U+092B)                   फ़ (U+092B U+093C)
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in a given context (e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing), they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
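The "all constituents must match" condition can be sketched in Python as follows. The function name `cross_script_variant` is hypothetical, the table holds only a handful of Devanagari to Gurmukhi entries copied from Table 19, and matching by longest sequence first is an assumption of this sketch rather than something the proposal specifies.

```python
from typing import Dict, Optional

# A few Devanagari -> Gurmukhi entries from Table 19 (not the full table).
DEVA_TO_GURU: Dict[str, str] = {
    "\u0917": "\u0A17",              # ग -> ਗ
    "\u091F": "\u0A1F",              # ट -> ਟ
    "\u0920": "\u0A20",              # ठ -> ਠ
    "\u092E": "\u0A38",              # म -> ਸ
    "\u093F": "\u0A3F",              # vowel sign I
    "\u0924\u094D\u0924": "\u0A1C",  # त्त -> ਜ
}


def cross_script_variant(label: str, table: Dict[str, str]) -> Optional[str]:
    """Return the other-script label, or None if any constituent has no counterpart."""
    keys = sorted(table, key=len, reverse=True)  # try longer sequences first
    out = []
    i = 0
    while i < len(label):
        for key in keys:
            if label.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            return None  # one unmatched constituent is enough to distinguish the labels
    return "".join(out)
```

A single Devanagari character without a counterpart, such as क (U+0915) in this reduced table, makes the whole label non-confusable, so the function returns None for it.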
Devanagari confusable        Other script confusable      From script
ः (U+0903)                   ઃ (U+0A83)                   Gujarati
ः (U+0903)                   ః (U+0C03)                   Telugu
ः (U+0903)                   ಃ (U+0C83)                   Kannada
ः (U+0903)                   ഃ (U+0D03)                   Malayalam
ः (U+0903)                   ඃ (U+0D83)                   Sinhala
ः (U+0903)                   ঃ (U+0983) [17]              Bengali
उ (U+0909)                   ও (U+0993)                   Bengali
घ (U+0918)                   ঘ (U+0998)                   Bengali
ठ (U+0920)                   ਨ (U+0A28)                   Gurmukhi
ठ (U+0920)                   ਰ (U+0A30)                   Gurmukhi
ड (U+0921)                   ਡ (U+0A21)                   Gurmukhi
ड (U+0921)                   ਤ (U+0A24)                   Gurmukhi
ढ (U+0922)                   ਢ (U+0A22)                   Gurmukhi
त (U+0924)                   ਜ (U+0A1C)                   Gurmukhi
य (U+092F)                   ਧ (U+0A27)                   Gurmukhi
(U+0945)                     (U+0981)                     Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
Sequence Description                                      Sequence   Example        Constituting characters
Consonant + Halant + Consonant + Matra                    CHC[M]     क्की kkī       U+0915 U+094D U+0915 U+0940
Consonant + Halant + Consonant + Anusvara                 CHC[B]     क्कं kkaṁ      U+0915 U+094D U+0915 U+0902
Consonant + Halant + Consonant + Candrabindu              CHC[D]     क्कँ kkaṃ      U+0915 U+094D U+0915 U+0901
Consonant + Halant + Consonant + Visarga                  CHC[X]     क्कः kkaḥ      U+0915 U+094D U+0915 U+0903
Table 14
3B3 (CH)CM may be followed by a B, D or X
Example:
Sequence Description                                      Sequence   Example        Constituting characters
Consonant + Halant + Consonant + Matra + Anusvara         CHCM[B]    क्कीं kkīṁ     U+0915 U+094D U+0915 U+0940 U+0902
Consonant + Halant + Consonant + Matra + Candrabindu      CHCM[D]    क्कीँ kkīṃ     U+0915 U+094D U+0915 U+0940 U+0901
Consonant + Halant + Consonant + Matra + Visarga          CHCM[X]    क्कीः kkīḥ     U+0915 U+094D U+0915 U+0940 U+0903
Table 15
These are the basic akshar formation rules on which the overall Devanagari LGR is based. As languages other than Hindi are considered, some additional language-specific characters and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
6 Variants
There are no characters or character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from normally perceived character formations by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed as variants, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 Visually confusables in Appendix A Visually confusable characters/sequences lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants themselves in Table 16 and Table 17.
6.1 Vowel / Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated that the Whole Label Evaluation rules (given in Section 7) be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed.
Variant 1                    Variant 2
आ (U+0906)                   आ़ (U+0906 U+093C)
ओ (U+0913)                   ओ़ (U+0913 U+093C)
ा (U+093E)                   ा़ (U+093E U+093C)
ो (U+094B)                   ो़ (U+094B U+093C)
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 Proposed Variants - Set 1 share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. E.g., in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C), where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 Proposed Variants - Set 1, the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus, the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to the context rules on the Nukta, the condition is only required for the first reason in this case.
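The effect of the "not followed by a Nukta" condition can be shown with a short Python sketch. The function name `add_nukta_variant` is hypothetical; the base set is taken from Table 16.

```python
NUKTA = "\u093C"
# Variant 1 code points from Table 16: AA, O, vowel sign AA, vowel sign O.
NUKTA_BASES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def add_nukta_variant(label: str) -> str:
    """Insert a Nukta after each Table 16 base that is not already followed by one."""
    out = []
    for i, ch in enumerate(label):
        out.append(ch)
        if ch in NUKTA_BASES and label[i + 1:i + 2] != NUKTA:
            out.append(NUKTA)
    return "".join(out)
```

Because a base that already carries a Nukta is skipped, applying the mapping a second time changes nothing, so the regenerative case U+0906 U+093C U+093C never arises.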
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set              Mapping
Variant Set A    093E 0901 <--> 0949 0902
Variant Set B    0906 0901 <--> 0911 0902
Variant Set C    0906 0902 <--> 0974
Variant Set D    093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source   Glyph   Target       Glyph   Type         Variant Context
0906     आ       0906 093C    आ़      ↔ blocked    not followed-by-N
093E     ा       093E 093C    ा़      ↔ blocked    not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E respectively into the sequences of Variant Sets A, B, C and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence or absence of Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive the context rule when(followed-by-V-C-or-end).
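The context intersection described for Set C can be written out as a tiny Python sketch. The symbolic labels ("V", "C", "end", and so on) and the helper name `shared_context` are illustrative assumptions, not identifiers from the LGR.

```python
# Symbolic "may be followed by" contexts as described for Set C; "end" is end of label.
FOLLOW_0902 = {"V", "C", "end"}
FOLLOW_0974 = {"V", "C", "end", "D", "B", "X"}  # D, B, X stand for 0901, 0902, 0903


def shared_context(a: set, b: set) -> set:
    """Contexts in which both sides of a variant mapping are valid."""
    return a & b
```

The intersection leaves only vowel, consonant and end of label, which corresponds to the rule when(followed-by-V-C-or-end).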
The resulting variant sets and variant contextual rules are:
Set              Mapping                                              Variant Contextual Rule
Variant Set A    093E 0901 <--> 093E 093C 0901 <--> 0949 0902         -
Variant Set B    0906 0901 <--> 0906 093C 0901 <--> 0911 0902         -
Variant Set C    0906 0902 <--> 0906 093C 0902 <--> 0974              when(followed-by-V-C-or-end)
Variant Set D    093B <--> 093E 0902 <--> 093E 093C 0902              when(followed-by-V-C-or-end)
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels / Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels / Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1                    Variant 2
ॳ (U+0973)                   अं (U+0905 U+0902)
(U+093A)                     (U+0902)
ॴ (U+0974)                   आं (U+0906 U+0902)
ऻ (U+093B)                   ां (U+093E U+0902)
ऎ (U+090E)                   ऐ (U+0910)
(U+0946)                     (U+0947)
ॵ (U+0975)                   औ (U+0914)
ॏ (U+094F)                   ौ (U+094C)
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings but the code point sequencesintroducedasmappingwill need tobe listed in the repertoirewithmatching codepointcontextsothatbothsourceandtargetofthevariantmappinghavematchingcontexts(not
precededbyH)
63 HalantinFinalPosition(Onlyadiscussionnotproposedasvariants)
AnothercaseofdeceptivesimilaritytoamajorityoftheDevanagariuserbaseisofaword
ending inHalant (U+094D) vis-agrave-vis the samewordwithout the final Halant As thefunctionofHalant is of a vowel killer coming at the endmanyusers tend to ignore thephonetic effect of its presenceabsence The majority of users would pronounce both
words in the same way thereby creating a perception of (false) equivalence Howeverthere also exist some userswho clearly require the final Halant to achieve the peculiarphonetic effectof a truncated implicit vowel sound in theendTheseusersmakea clear
distinctionbetweenthetwowords(withandwithoutthefinalHalant)Itisforthisreasonthat the final Halant is being accommodated in the Whole Label Evaluation rules forDevanagari
In these cases the presence or absence of final Halant is clearly visible and there is no
apparentcasetomakethemvariantpairsEventuallyinthelightofpracticalexperienceafutureNBGPrevisionmayassessifthesecasesneedtobeconsideredasvariantpairs
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
38
64 VariantsbasedonCandrabinduandCandraVowelSignsfollowedbyAnusvara
This isacaseofpairsofsimilar lookingvariants involvingaCandrabindu(U+0901) InDevanagari there are twoCandravowel signs viz (U+0945) and ॉ (U+0949)whichwhen succeeded by an Anusvara (U+0902) create a shape which resembles aCandrabindu(U+0901)ThisgivesrisetopairswhichgetrenderedexactlyalikeinmanyfontsThoughsome fontscanrender themdifferently thebehavior isnotconsistentandkeepsroomforambiguityTheactualpairsofthevariantsareasfollows
Variant 1 | Variant 2
ँ (U+0901) | ॅं (U+0945 U+0902)
ाँ (U+093E U+0901) | ॉं (U+0949 U+0902)
अँ (U+0905 U+0901) | ॲं (U+0972 U+0902)
एँ (U+090F U+0901) | ऍं (U+090D U+0902)
आँ (U+0906 U+0901) | ऑं (U+0911 U+0902)
Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, the sequences (U+0945 U+0902) and (U+0949 U+0902) should each be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. U+0945. While the Candrabindu (U+0901) can be preceded by a Consonant, a Vowel, a Nukta or a Matra, the same is not true for U+0945, which is itself a Matra. A Matra can only be preceded by a Consonant or by a Consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or by a Consonant followed by a Nukta.
There is no additional constraint on the character preceding (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as the mapping (U+0901) -> (U+0945 U+0902).
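The context condition can be illustrated with a small Python sketch. It is not taken from the LGR XML: the function names are hypothetical, and the consonant test assumes the contiguous Unicode range U+0915..U+0939 rather than the exact repertoire of Table 6.

```python
CANDRABINDU = "\u0901"
ANUSVARA = "\u0902"
CANDRA_E = "\u0945"
NUKTA = "\u093C"


def is_consonant(ch: str) -> bool:
    # Assumption: consonants are the block KA..HA, not the exact Table 6 set.
    return "\u0915" <= ch <= "\u0939"


def candrabindu_swap_allowed(label: str, i: int) -> bool:
    """True if the Candrabindu at index i is preceded by C or by C + Nukta."""
    if label[i] != CANDRABINDU or i == 0:
        return False
    prev = label[i - 1]
    if is_consonant(prev):
        return True
    return prev == NUKTA and i >= 2 and is_consonant(label[i - 2])


def candrabindu_variant(label: str) -> str:
    """Replace each eligible U+0901 by the sequence U+0945 U+0902."""
    out = []
    for i, ch in enumerate(label):
        if candrabindu_swap_allowed(label, i):
            out.append(CANDRA_E + ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

Under these assumptions a Candrabindu after a Vowel or a Matra is left untouched, which mirrors the rule that a Matra such as U+0945 cannot follow those categories.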
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing either of these variants is chosen first, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where a label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under the NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under the NBGP.
The NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under the NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are deemed to be equivalents of each other, semantically or otherwise. They are grouped only on the basis of possible visual confusability.
The NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं (U+0902) | ਂ (U+0A02)
इ (U+0907) | ਙ (U+0A19)
उ (U+0909) | ਤ (U+0A24)
ग (U+0917) | ਗ (U+0A17)
घ (U+0918) | ਬ (U+0A2C)
ट (U+091F) | ਟ (U+0A1F)
ठ (U+0920) | ਠ (U+0A20)
ढ (U+0922) | ਫ (U+0A2B)
प (U+092A) | ਧ (U+0A27)
भ (U+092D) | ਮ (U+0A2E)
म (U+092E) | ਸ (U+0A38)
व (U+0935) | ਕ (U+0A15)
ह (U+0939) | ਵ (U+0A35)
ऺ (U+093A) | ਂ (U+0A02)
़ (U+093C) | ਼ (U+0A3C)
ि (U+093F) | ਿ (U+0A3F)
ी (U+0940) | ੀ (U+0A40)
ॅ (U+0945) | ੱ (U+0A71)
ॆ (U+0946) | ੇ (U+0A47)
ॆ (U+0946) | ੋ (U+0A4B)
े (U+0947) | ੇ (U+0A47)
े (U+0947) | ੋ (U+0A4B)
ै (U+0948) | ੈ (U+0A48)
ॖ (U+0956) | ੁ (U+0A41)
ॗ (U+0957) | ੂ (U+0A42)
प्टि (U+092A U+094D U+091F U+093F) | ਇ (U+0A07)
प्टी (U+092A U+094D U+091F U+0940) | ਈ (U+0A08)
प्टे (U+092A U+094D U+091F U+0947) | ਏ (U+0A0F)
प्टॆ (U+092A U+094D U+091F U+0946) | ਏ (U+0A0F)
त्त (U+0924 U+094D U+0924) | ਜ (U+0A1C)
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म (U+092E) | ম (U+09AE)
ि (U+093F) | ি (U+09BF)
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (् - DEVANAGARI SIGN VIRAMA) and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in the Santali language)
b. ओ (U+0913) (Required in the Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in the Santali language)
b. ो (U+094B) (Required in the Santali language)
2. H must be preceded by C or CN, where CN is a C followed by an N.
3. M must be preceded by C or CN, where CN is a C followed by an N.
4. X must be preceded by one of V, C, N or M.
5. B must be preceded by one of V, C, N or M.
6. D must be preceded by one of V, C, N or M.
7. V can NOT be preceded by H (details in "Case of V preceded by H").
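The seven rules above can be sketched as a small Python checker. This is an illustration, not the normative LGR: the category sets are assumed from the layout of the Unicode Devanagari block rather than copied from Table 6, the Eyelash Reph sequences are not modelled, and the function names are hypothetical.

```python
# Assumed category sets (approximation of Table 6, which is not reproduced here).
CONSONANTS = set(range(0x0915, 0x093A))                       # KA..HA
VOWELS = set(range(0x0905, 0x0915)) | set(range(0x0972, 0x0976))
MATRAS = {0x093A, 0x093B, 0x094F, 0x0956, 0x0957} | set(range(0x093E, 0x094D))
SIGNS = {0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H", 0x093C: "N"}

# Sets C1, V1 and M1 from rule 1.
NUKTA_BASES = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C,
               0x0921, 0x0922, 0x092B, 0x0906, 0x0913, 0x093E, 0x094B}


def category(cp):
    """Map a code point to its WLE symbol, or None if outside the assumed sets."""
    if cp in SIGNS:
        return SIGNS[cp]
    if cp in CONSONANTS:
        return "C"
    if cp in VOWELS:
        return "V"
    if cp in MATRAS:
        return "M"
    return None


def wle_violations(label):
    """Return (index, rule) pairs; rule 0 marks an unclassified code point."""
    cats = [category(ord(ch)) for ch in label]
    found = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        prev_cp = ord(label[i - 1]) if i > 0 else None
        after_c_or_cn = prev == "C" or (
            prev == "N" and i >= 2 and cats[i - 2] == "C")
        if cat is None:
            found.append((i, 0))
        elif cat == "N" and prev_cp not in NUKTA_BASES:
            found.append((i, 1))
        elif cat == "H" and not after_c_or_cn:
            found.append((i, 2))
        elif cat == "M" and not after_c_or_cn:
            found.append((i, 3))
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            found.append((i, {"X": 4, "B": 5, "D": 6}[cat]))
        elif cat == "V" and prev == "H":
            found.append((i, 7))
    return found
```

With these assumptions, the multi-word example discussed under "Case of V preceded by H" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930) is reported as a rule 7 violation at the Vowel that follows the Halant.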
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or with a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DN Domains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11, http://www.unicode.org/versions/Unicode11.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क (U+0915) | क़ (U+0915 U+093C)
ख (U+0916) | ख़ (U+0916 U+093C)
ग (U+0917) | ग़ (U+0917 U+093C)
च (U+091A) | च़ (U+091A U+093C)
छ (U+091B) | छ़ (U+091B U+093C)
ज (U+091C) | ज़ (U+091C U+093C)
ड (U+0921) | ड़ (U+0921 U+093C)
ढ (U+0922) | ढ़ (U+0922 U+093C)
फ (U+092B) | फ़ (U+092B U+093C)
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are grouped only on the basis of possible visual confusability.
At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as part of a domain name, or as a single word where there is no surrounding text from the same script to distinguish them, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
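The all-or-nothing condition in the last paragraph can be sketched in Python as follows. The mapping is a small illustrative subset of the single code point pairs in Table 19; multi-code-point akshara sequences are not handled, and the function name is hypothetical.

```python
# Illustrative subset of the Devanagari -> Gurmukhi pairs from Table 19.
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # GA
    "\u091F": "\u0A1F",  # TTA
    "\u0920": "\u0A20",  # TTHA
    "\u092E": "\u0A38",  # Devanagari MA resembles Gurmukhi SA
    "\u0935": "\u0A15",  # Devanagari VA resembles Gurmukhi KA
    "\u093F": "\u0A3F",  # vowel sign I
    "\u0940": "\u0A40",  # vowel sign II
}


def gurmukhi_variant(label: str):
    """Return the whole-label Gurmukhi counterpart, or None if any character lacks one."""
    if not label or any(ch not in DEVA_TO_GURU for ch in label):
        return None
    return "".join(DEVA_TO_GURU[ch] for ch in label)
```

A single unmapped character is enough to return None, which reflects the statement that one visually distinct character makes the whole string non-confusable.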
Devanagari confusable | Other script confusable | From script
ः (U+0903) | ઃ (U+0A83) | Gujarati
ः (U+0903) | ః (U+0C03) | Telugu
ः (U+0903) | ಃ (U+0C83) | Kannada
ः (U+0903) | ഃ (U+0D03) | Malayalam
ः (U+0903) | ඃ (U+0D83) | Sinhala
ः (U+0903) | ঃ (U+0983) [17] | Bengali
उ (U+0909) | ও (U+0993) | Bengali
घ (U+0918) | ঘ (U+0998) | Bengali
ठ (U+0920) | ਨ (U+0A28) | Gurmukhi
ठ (U+0920) | ਰ (U+0A30) | Gurmukhi
ड (U+0921) | ਡ (U+0A21) | Gurmukhi
ड (U+0921) | ਤ (U+0A24) | Gurmukhi
ढ (U+0922) | ਢ (U+0A22) | Gurmukhi
त (U+0924) | ਜ (U+0A1C) | Gurmukhi
य (U+092F) | ਧ (U+0A27) | Gurmukhi
ॅ (U+0945) | ঁ (U+0981) | Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
and rules are introduced. There are some additional finer aspects to these rules as one takes into account the digits, punctuation and special standalone characters like Avagraha. Those aspects are not discussed here, as the [MSR], on which the LGRs are supposed to be based, excludes those characters.
6 Variants
There are no characters/character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and which look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants in two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed as variants, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. They can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for the positioning of the Nukta character (U+093C), which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated that the Whole Label Evaluation rules (given in Section 7) be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that are deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are proposed:
Variant 1 | Variant 2
आ (U+0906) | आ़ (U+0906 U+093C)
ओ (U+0913) | ओ़ (U+0913 U+093C)
ा (U+093E) | ा़ (U+093E U+093C)
ो (U+094B) | ो़ (U+094B U+093C)
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. E.g., in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory: if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), the substitution introduces a new instance of आ (U+0906), namely the first code point of the sequence (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C), where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → ा़ (U+093E U+093C)
ो (U+094B) → ो़ (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric definition of variants, it is necessary to define both the forward and the reverse variant mappings using the same condition (see also [RFC8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to the context rules on the Nukta, the condition is only required for the first reason in this case.
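The effect of the "not followed by a Nukta" condition can be illustrated with a short Python sketch that enumerates the Nukta variants of a label. It is a simplified model, not the LGR tool's algorithm: only the four Table 16 pairs are covered, and the function name is hypothetical.

```python
from itertools import product

NUKTA = "\u093C"
BASES = "\u0906\u0913\u093E\u094B"  # Variant 1 code points of Table 16


def nukta_variants(label: str) -> set:
    """Enumerate labels reachable through the Table 16 mappings, in both directions."""
    slots = []
    i = 0
    while i < len(label):
        ch = label[i]
        if ch in BASES and label[i + 1:i + 2] == NUKTA:
            # Variant 2 form: maps back only if no further Nukta follows.
            if label[i + 2:i + 3] != NUKTA:
                slots.append((ch + NUKTA, ch))
            else:
                slots.append((ch + NUKTA,))
            i += 2
        elif ch in BASES:
            # Variant 1 form, which is not followed by a Nukta at this point.
            slots.append((ch, ch + NUKTA))
            i += 1
        else:
            slots.append((ch,))
            i += 1
    return {"".join(p) for p in product(*slots)} - {label}
```

Each base is treated as a single slot, so a substituted sequence is never substituted again, and the forward and reverse mappings come out symmetric.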
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E, respectively, into the sequences of Variant Sets A, B, C and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence or absence of a Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection of these contexts.
Likewise, for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive the context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
Additional Notes:
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users, who are not conversant with Kashmiri, can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where Kashmiri Vowels/Vowel signs can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ (U+0973) | अं (U+0905 U+0902)
ऺ (U+093A) | ं (U+0902)
ॴ (U+0974) | आं (U+0906 U+0902)
ऻ (U+093B) | ां (U+093E U+0902)
ऎ (U+090E) | ऐ (U+0910)
ॆ (U+0946) | े (U+0947)
ॵ (U+0975) | औ (U+0914)
ॏ (U+094F) | ौ (U+094C)
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both the source and the target of each variant mapping have matching contexts (not preceded by H).
63 HalantinFinalPosition(Onlyadiscussionnotproposedasvariants)
AnothercaseofdeceptivesimilaritytoamajorityoftheDevanagariuserbaseisofaword
ending inHalant (U+094D) vis-agrave-vis the samewordwithout the final Halant As thefunctionofHalant is of a vowel killer coming at the endmanyusers tend to ignore thephonetic effect of its presenceabsence The majority of users would pronounce both
words in the same way thereby creating a perception of (false) equivalence Howeverthere also exist some userswho clearly require the final Halant to achieve the peculiarphonetic effectof a truncated implicit vowel sound in theendTheseusersmakea clear
distinctionbetweenthetwowords(withandwithoutthefinalHalant)Itisforthisreasonthat the final Halant is being accommodated in the Whole Label Evaluation rules forDevanagari
In these cases the presence or absence of final Halant is clearly visible and there is no
apparentcasetomakethemvariantpairsEventuallyinthelightofpracticalexperienceafutureNBGPrevisionmayassessifthesecasesneedtobeconsideredasvariantpairs
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
38
64 VariantsbasedonCandrabinduandCandraVowelSignsfollowedbyAnusvara
This isacaseofpairsofsimilar lookingvariants involvingaCandrabindu(U+0901) InDevanagari there are twoCandravowel signs viz (U+0945) and ॉ (U+0949)whichwhen succeeded by an Anusvara (U+0902) create a shape which resembles aCandrabindu(U+0901)ThisgivesrisetopairswhichgetrenderedexactlyalikeinmanyfontsThoughsome fontscanrender themdifferently thebehavior isnotconsistentandkeepsroomforambiguityTheactualpairsofthevariantsareasfollows
Variant1 Variant2
U+0901
U+0945U+0902
ा
U+093EU+0901
ॉ
U+0949U+0902
अ
U+0905U+0901
U+0972U+0902
ए
U+090FU+0901
ऍ
U+090DU+0902
आ
U+0906U+0901
ऑ
U+0911U+0902 Table 18 Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, like a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क - U+0915):
Variant Pair 1: क followed by (U+0901) and क followed by (U+0945 U+0902)
Variant Pair 2: क followed by (U+093E U+0901) and क followed by (U+0949 U+0902)
Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should each be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, a Vowel, a Nukta or a Matra, the same is not true for (U+0945), which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902) as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
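
The context rule of Section 6.4.1 can be illustrated with a short Python sketch. It is not part of the LGR: the consonant class is assumed to be the whole block U+0915 to U+0939 rather than the normative repertoire, and the function names are hypothetical.

```python
# Assumption: every code point in U+0915..U+0939 counts as a Consonant.
CONSONANTS = {chr(c) for c in range(0x0915, 0x093A)}
NUKTA = "\u093C"
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"


def preceded_by_c_or_cn(label, i):
    """True if position i follows a Consonant, or a Consonant plus Nukta."""
    if i >= 1 and label[i - 1] in CONSONANTS:
        return True
    return i >= 2 and label[i - 1] == NUKTA and label[i - 2] in CONSONANTS


def candra_variant(label):
    """Replace each Candrabindu that satisfies the context rule."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and preceded_by_c_or_cn(label, i):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

With these assumptions, a Candrabindu after Ka (U+0915) is mapped to U+0945 U+0902, while a Candrabindu after the vowel U+0905 is left untouched because a Matra cannot follow a Vowel.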
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
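
A minimal sketch of this first-come blocking behavior is given below. It assumes only the four single-code-point pairs of Table 17 and a hypothetical in-memory `Registry` class; a real LGR tool would work from the full XML rule set.

```python
# Single-code-point pairs taken from Table 17 (a subset, for illustration).
PAIRS = [("\u090E", "\u0910"), ("\u0946", "\u0947"),
         ("\u0975", "\u0914"), ("\u094F", "\u094C")]
CANON = {second: first for first, second in PAIRS}


def index_label(label):
    """Map every variant code point to one representative of its pair."""
    return "".join(CANON.get(ch, ch) for ch in label)


class Registry:
    """Hypothetical registry: the first applicant wins, variants are blocked."""

    def __init__(self):
        self._taken = {}

    def apply(self, label):
        key = index_label(label)
        if key in self._taken:
            same = self._taken[key] == label
            return "already allocated" if same else "blocked"
        self._taken[key] = label
        return "allocated"
```

Neither member of a pair is preferred: whichever label is applied for first is allocated, and any later label with the same index label is blocked.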
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
U+0902 | U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
U+093A | U+0A02
U+093C | U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
U+0945 | U+0A71
U+0946 | U+0A47
U+0946 | U+0A4B
U+0947 | U+0A47
U+0947 | U+0A4B
U+0948 | U+0A48
U+0956 | U+0A41
U+0957 | U+0A42
U+092A U+094D U+091F U+093F | ਇ U+0A07
U+092A U+094D U+091F U+0940 | ਈ U+0A08
U+092A U+094D U+091F U+0947 | ਏ U+0A0F
U+092A U+094D U+091F U+0946 | ਏ U+0A0F
U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (DEVANAGARI SIGN VIRAMA) and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1 N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
The set V1 consists of these vowels:
a आ (U+0906) (Required in Santali language)
b ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a ा (U+093E) (Required in Santali language)
b ो (U+094B) (Required in Santali language)
2 H must be preceded by C or CN (where CN is a C followed by an N)
3 M must be preceded by C or CN (where CN is a C followed by an N)
4 X must be preceded by either of V, C, N or M
5 B must be preceded by either of V, C, N or M
6 D must be preceded by either of V, C, N or M
7 V can NOT be preceded by H (details in "Case of V preceded by H")
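
Rules 1 to 7 can be expressed as a simple left-context check. The Python sketch below is only an illustration: the character classes are assumed to be plain Unicode ranges rather than the normative Table 6 repertoire, the Eyelash Reph sequences are not modeled, and `classify` and `wle_violations` are hypothetical names.

```python
# Simplified character classes for illustration only; the normative
# repertoire (Table 6) is narrower than these Unicode ranges.
CONSONANTS = {chr(c) for c in range(0x0915, 0x093A)}   # C
VOWELS = {chr(c) for c in range(0x0905, 0x0915)}       # V
MATRAS = {chr(c) for c in range(0x093E, 0x094D)}       # M
SIGNS = {"\u0902": "B", "\u0901": "D", "\u0903": "X",
         "\u094D": "H", "\u093C": "N"}

# Union of the sets C1, V1 and M1 from rule 1.
NUKTA_BASES = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B"
                  "\u0906\u0913\u093E\u094B")


def classify(ch):
    """Map a character to its WLE symbol, or None if unclassified."""
    if ch in CONSONANTS:
        return "C"
    if ch in VOWELS:
        return "V"
    if ch in MATRAS:
        return "M"
    return SIGNS.get(ch)


def wle_violations(label):
    """Return (index, rule number) pairs for breaches of rules 1 to 7."""
    found = []
    for i, ch in enumerate(label):
        kind = classify(ch)
        prev = label[i - 1] if i > 0 else ""
        prev_kind = classify(prev) if prev else None
        # True when the preceding context is C, or C followed by N ("CN").
        after_c_or_cn = prev_kind == "C" or (
            prev_kind == "N" and i > 1 and classify(label[i - 2]) == "C")
        if kind == "N" and prev not in NUKTA_BASES:
            found.append((i, 1))
        elif kind == "H" and not after_c_or_cn:
            found.append((i, 2))
        elif kind == "M" and not after_c_or_cn:
            found.append((i, 3))
        elif kind in ("X", "B", "D") and prev_kind not in ("V", "C", "N", "M"):
            found.append((i, {"X": 4, "B": 5, "D": 6}[kind]))
        elif kind == "V" and prev_kind == "H":
            found.append((i, 7))
    return found
```

Under these assumptions, the sequence U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 is flagged under rule 7 at the Vowel that follows the Halant, while the same sequence without U+094D passes.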
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1 As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2 The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K P Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, httpswwwethnologuecomaboutlanguage-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, httpstoolsietforghtmlrfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", RFC 8228, August 2017, httpstoolsietforghtmlrfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, httpscdacinindexaspxid=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, httpscdacinindexaspxid=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, httpscdacin (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11.0, httpwwwunicodeorgversionsUnicode110 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, httpwwwunicodeorgversionsUnicode500 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, httpwwwunicodeorgversionsUnicode510 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, httpwwwunicodeorgversionsUnicode600 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, httpswwwomniglotcomwritinghindihtm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, httpswwwomniglotcomwritingmarathihtm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, httpswwwomniglotcomwritingsanskrithtm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, httpswwwomniglotcomwritingsindhihtm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, httpswwwomniglotcomwritingkashmirihtm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), httpwwwunicodeorgversionsUnicode1000ch12pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, httpunicodeorg~emulleriwgp8utcdochtml (Accessed on 13th Nov 2017)
[108] M. K. Raina, "How to read and write Kashmiri in Devanagari", httpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, "Devanāgarī Alphabet and its Romanization", httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, httpswwwomniglotcomwritingbodohtm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, httpswwwomniglotcomwritingmaithilihtm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, httpswwwomniglotcomwritingkonkanihtm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, httpswwwomniglotcomwritingnepalihtm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, httpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1 Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchison, London, 1968.
10.2 DEVANĀGARĪ
1 Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2 Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3 Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4 Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5 Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6 Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7 Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8 Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9 Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10 Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11 Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12 Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13 McGregor, R. S. (1977). Outline of Hindi Grammar, 2nd ed. Delhi: Oxford University Press.
14 McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15 McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16 McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17 Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18 Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19 Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20 Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: DK Printworld.
21 Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22 Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23 Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1 IS 10401: 8-bit code for information interchange, 1982
2 IS 10315: 7-bit coded character set for information interchange, 1985
3 IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987
4 ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001
5 ISO 2375: Procedure for registration of escape sequences, 2003
6 ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001
7 IDN POLICY: httpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters and character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | U+091A U+093C
छ U+091B | U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in the given context, e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
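
The all-or-nothing condition in the last paragraph can be sketched as follows. This Python fragment is illustrative only: the mapping holds just four single-code-point Devanagari-Gurmukhi pairs copied from Table 19, it works per code point rather than per akshara, and `cross_script_variant` is a hypothetical name.

```python
# Four single-code-point pairs from Table 19 (Devanagari -> Gurmukhi).
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # GA
    "\u091F": "\u0A1F",  # TTA
    "\u0920": "\u0A20",  # TTHA
    "\u092E": "\u0A38",  # Devanagari MA resembles Gurmukhi SA
}


def cross_script_variant(label, mapping=DEVA_TO_GURU):
    """Return the other-script label, or None if any character is unmapped."""
    if not label or any(ch not in mapping for ch in label):
        return None
    return "".join(mapping[ch] for ch in label)
```

A label made up only of mapped characters yields a whole-label counterpart, while a single unmapped character, such as Ka (U+0915) here, is enough to make the result `None`.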
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 (see note 17) | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
U+0945 | U+0981 | Bengali
Table 22: Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
6 Variants
There are no characters or character sequences in Devanagari which can be created by using the characters permitted as per the [MSR] and that look exactly alike. However, Devanagari has ample cases of confusingly similar variants. The NBGP categorizes these confusingly similar variants into two groups:
Group 1: Confusing due to pure visual similarity
Group 2: Confusing due to deviation from the character formations normally perceived by the larger linguistic community
As advised by ICANN, no cases belonging to Group 1 are proposed, as there is another panel (the String Similarity Assessment Panel) entrusted to deal with such cases. Table 21 (Visually confusables) in Appendix A (Visually confusable characters/sequences) lists them.
Cases which belong to Group 2, however, are proposed to be considered as variants. These cases are not of mere visual similarity, as they involve some deviations from the widely accepted norms of Devanagari akshar formation. These can cause confusion even to a careful observer and hence are being proposed as variants. Following is a brief description of these variants, followed by the variants in Table 16 and Table 17.
6.1 Vowel/Vowel sign followed by Nukta
The Santali language has a unique requirement for Nukta character (U+093C) positioning which is not common in other Devanagari-based languages. Santali requires the Nukta character to follow certain Vowels and Matras. Complete representation of these Santali combinations necessitated that the Whole Label Evaluation rules (given in Section 7) be opened up for these specific cases. A regular non-Santali user mostly cannot even anticipate the possibility of such a combination and can confuse it for something else.
This gives rise to the possibility of creating certain labels that can be deceptively similar for a majority of the Devanagari user base. This being a unique case of homographic similarity, the following variants are being proposed.
Variant 1 | Variant 2
आ U+0906 | आ U+0906 U+093C
ओ U+0913 | ओ U+0913 U+093C
ा U+093E | ा U+093E U+093C
ो U+094B | ो U+094B U+093C
Table 16: Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) have a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. E.g. in the first pair, आ (U+0906) is a subset of (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the first code point of (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) where a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → (U+0906 U+093C)
ओ (U+0913) → (U+0913 U+093C)
ा (U+093E) → (U+093E U+093C)
ो (U+094B) → (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC7940]). In order to maintain a symmetric definition of variants, it is necessary to define both the forward and the symmetric (reverse) variants using the same condition (see also [RFC8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
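
The effect of the "not followed by a Nukta" condition, applied in both directions, can be seen in a small Python sketch. It is an illustration under the assumption that only the four Table 16 base characters take part; `single_step_variants` is a hypothetical helper, not part of any LGR tool.

```python
NUKTA = "\u093C"
# Variant 1 characters of Table 16.
SANTALI_BASES = set("\u0906\u0913\u093E\u094B")


def single_step_variants(label):
    """Labels reachable by adding or dropping one Nukta after a base."""
    out = set()
    for i, ch in enumerate(label):
        if ch not in SANTALI_BASES:
            continue
        rest = label[i + 1:]
        if rest.startswith(NUKTA):
            # Variant 2 -> Variant 1, allowed only if no second Nukta follows.
            if not rest[1:].startswith(NUKTA):
                out.add(label[:i + 1] + rest[1:])
        else:
            # Variant 1 -> Variant 2, allowed because no Nukta follows.
            out.add(label[:i + 1] + NUKTA + rest)
    return out
```

Because the condition is checked before each substitution, U+0906 U+093C never expands to U+0906 U+093C U+093C, and a label that already contains a double Nukta yields no variants at all.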
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Type | Variant Context
0906 | आ | 0906 093C | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E respectively into the sequences of Variant Sets A, B, C and D that contain them, the result is a valid sequence that displays with a Nukta. As the presence or absence of a Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when (followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when (followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when (followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when (followed-by-V-C-or-end)
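
The extension step can be reproduced mechanically. The Python sketch below is a hedged illustration: it encodes the four original sets as tuples of code points and assumes, as holds for these sets, that the Nukta-taking base (0906 or 093E) is always the first code point of a member; `extend_with_nukta` is a hypothetical name.

```python
NUKTA = 0x093C
NUKTA_BASES = {0x0906, 0x093E}

# The four variant sets before extension, as tuples of code points.
SETS = {
    "A": [(0x093E, 0x0901), (0x0949, 0x0902)],
    "B": [(0x0906, 0x0901), (0x0911, 0x0902)],
    "C": [(0x0906, 0x0902), (0x0974,)],
    "D": [(0x093B,), (0x093E, 0x0902)],
}


def extend_with_nukta(members):
    """Add the Nukta form of every member that starts with 0906 or 093E."""
    extended = list(members)
    for seq in members:
        starts_with_base = seq[0] in NUKTA_BASES
        has_nukta = len(seq) > 1 and seq[1] == NUKTA
        if starts_with_base and not has_nukta:
            candidate = (seq[0], NUKTA) + seq[1:]
            if candidate not in extended:
                extended.append(candidate)
    return extended
```

Running `extend_with_nukta` over each set adds exactly one member per set, matching the four additions listed in Section 6.1.2.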
Additional Notes:
1 Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 technically overlaps with the four sets above. However, because of the variant context rule "when (follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2 Another variant set overlapping with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | अ U+0905 U+0902
U+093A | U+0902
ॴ U+0974 | आ U+0906 U+0902
ऻ U+093B | ा U+093E U+0902
ऎ U+090E | ऐ U+0910
U+0946 | U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point contexts so that both source and target of the variant mapping have matching contexts (not preceded by H).
63 HalantinFinalPosition(Onlyadiscussionnotproposedasvariants)
AnothercaseofdeceptivesimilaritytoamajorityoftheDevanagariuserbaseisofaword
ending inHalant (U+094D) vis-agrave-vis the samewordwithout the final Halant As thefunctionofHalant is of a vowel killer coming at the endmanyusers tend to ignore thephonetic effect of its presenceabsence The majority of users would pronounce both
words in the same way thereby creating a perception of (false) equivalence Howeverthere also exist some userswho clearly require the final Halant to achieve the peculiarphonetic effectof a truncated implicit vowel sound in theendTheseusersmakea clear
distinctionbetweenthetwowords(withandwithoutthefinalHalant)Itisforthisreasonthat the final Halant is being accommodated in the Whole Label Evaluation rules forDevanagari
In these cases the presence or absence of final Halant is clearly visible and there is no
apparentcasetomakethemvariantpairsEventuallyinthelightofpracticalexperienceafutureNBGPrevisionmayassessifthesecasesneedtobeconsideredasvariantpairs
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
38
64 VariantsbasedonCandrabinduandCandraVowelSignsfollowedbyAnusvara
This isacaseofpairsofsimilar lookingvariants involvingaCandrabindu(U+0901) InDevanagari there are twoCandravowel signs viz (U+0945) and ॉ (U+0949)whichwhen succeeded by an Anusvara (U+0902) create a shape which resembles aCandrabindu(U+0901)ThisgivesrisetopairswhichgetrenderedexactlyalikeinmanyfontsThoughsome fontscanrender themdifferently thebehavior isnotconsistentandkeepsroomforambiguityTheactualpairsofthevariantsareasfollows
Variant1 Variant2
U+0901
U+0945U+0902
ा
U+093EU+0901
ॉ
U+0949U+0902
अ
U+0905U+0901
U+0972U+0902
ए
U+090FU+0901
ऍ
U+090DU+0902
आ
U+0906U+0901
ऑ
U+0911U+0902 Table 18 Proposed Variants - Set 3
Asthefirsttwosuggestedpairsarecomposedonlyofthedependentsignstheydonotgetproperly joined in the absence of an independent character like Consonant or a Vowel
beforethesameForthesakeofclarityboththepairsprecededbyaDevanagariLetterKa(क - U+0915)areshownhere
VariantPair1क(U+0901)andक (U+0945U+0902)
VariantPair2का (U+0949U+0902)andकॉ (U+0949U+0902)
IdeallythecaseofU+0945U+0902shouldberenderedas and the case of (U+0949
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
39
U+0902)shouldberenderedas where theAnusvara is clearlyseparated fromtheCandrashapeHowevernotallfontsfollowthesameconventionandmanyofthemrender
thetwoshapesexactly likeaCandrabinduThisgivesrise toanambiguityandhencethevariantpossibility
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) requires that, to create a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). Note that this sequence begins with a Matra sign, i.e. (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, a Vowel, a Nukta or a Matra, the same is not true for (U+0945), which is a Matra: a Matra can only be preceded by a Consonant or by a Consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or by a Consonant followed by a Nukta.
There is no additional constraint on the character preceding (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as the mapping (U+0901) -> (U+0945 U+0902).
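The context rule for this pair can be illustrated with a minimal Python sketch. It is not taken from the accompanying XML file: the function names are hypothetical, and the consonant test is a simplifying assumption (the range U+0915..U+0939) standing in for the actual repertoire of Table 6. It also replaces every eligible Candrabindu at once, whereas a real LGR tool enumerates all variant combinations.

```python
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093C"


def is_consonant(ch):
    # Assumption: consonants approximated by U+0915..U+0939 (see Table 6 for the real set).
    return "\u0915" <= ch <= "\u0939"


def preceded_by_c_or_cn(label, i):
    """True if position i follows a Consonant, or a Consonant followed by a Nukta."""
    if i == 0:
        return False
    prev = label[i - 1]
    if is_consonant(prev):
        return True
    return prev == NUKTA and i >= 2 and is_consonant(label[i - 2])


def candra_anusvara_variant(label):
    """Replace each Candrabindu that satisfies the context rule with U+0945 U+0902."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and preceded_by_c_or_cn(label, i):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

Under these assumptions, a Candrabindu after Ka is mapped, while a Candrabindu after an independent vowel is left alone because U+0945 could not validly follow the vowel.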
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing either of these variants is chosen first, the other equivalent variant label should be blocked.
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + …" cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. Table 19 lists the cases proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are deemed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is expected to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
U+0902 | U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
U+093A | U+0A02
U+093C | U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
U+0945 | U+0A71
U+0946 | U+0A47
U+0946 | U+0A4B
U+0947 | U+0A47
U+0947 | U+0A4B
U+0948 | U+0A48
U+0956 | U+0A41
U+0957 | U+0A42
U+092A U+094D U+091F U+093F | ਇ U+0A07
U+092A U+094D U+091F U+0940 | ਈ U+0A08
U+092A U+094D U+091F U+0947 | ਏ U+0A0F
U+092A U+094D U+091F U+0946 | ਏ U+0A0F
U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is U+0931 (ऱ - DEVANAGARI LETTER RRA), H is U+094D (DEVANAGARI SIGN VIRAMA), and C3 is either U+092F (य - DEVANAGARI LETTER YA) or U+0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (where CN is a C followed by an N)
3. M must be preceded by C or CN (where CN is a C followed by an N)
4. X must be preceded by any of V, C, N or M
5. B must be preceded by any of V, C, N or M
6. D must be preceded by any of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
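The seven rules above can be sketched as a small Python checker. This is an illustrative approximation, not the LGR XML: the names `category` and `violated_rules` are hypothetical, the category ranges are simplifying assumptions standing in for Table 6, and the Eyelash Reph sequences (S) are not modelled.

```python
# Sets taken from WLE rule 1.
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
SIGNS = {"\u0901": "D", "\u0902": "B", "\u0903": "X", "\u093C": "N", "\u094D": "H"}
RULE_FOR_SIGN = {"X": 4, "B": 5, "D": 6}


def category(ch):
    # Assumption: ranges approximate the V, C and M categories of Table 6.
    if ch in SIGNS:
        return SIGNS[ch]
    if "\u0905" <= ch <= "\u0914":
        return "V"
    if "\u0915" <= ch <= "\u0939":
        return "C"
    if "\u093E" <= ch <= "\u094C":
        return "M"
    return "?"


def violated_rules(label):
    """Return the numbers of the WLE rules violated, in label order."""
    found = []
    for i, ch in enumerate(label):
        cat = category(ch)
        prev = label[i - 1] if i > 0 else ""
        prev_cat = category(prev) if prev else ""
        # "C or CN": directly after a consonant, or after a Nukta that follows a consonant.
        after_c_or_cn = prev_cat == "C" or (
            prev_cat == "N" and i >= 2 and category(label[i - 2]) == "C")
        if cat == "N" and prev not in (C1 | V1 | M1):
            found.append(1)
        elif cat in ("H", "M") and not after_c_or_cn:
            found.append(2 if cat == "H" else 3)
        elif cat in RULE_FOR_SIGN and prev_cat not in ("V", "C", "N", "M"):
            found.append(RULE_FOR_SIGN[cat])
        elif cat == "V" and prev_cat == "H":
            found.append(7)
    return found
```

For instance, the sequence U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 violates only rule 7 in this sketch, since its independent vowel U+0905 directly follows a Halant.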
Additional rules are used only for variants where a Nukta maps to null, or for variants that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.1.1)
• Variant is not defined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of a context rule is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H is considered sufficient for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and .np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ. of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K P Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb. 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov. 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec. 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec. 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb. 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb. 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb. 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0 (Accessed on 12th Dec. 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec. 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec. 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec. 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct. 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct. 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct. 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct. 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct. 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct. 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct. 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov. 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov. 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec. 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec. 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec. 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec. 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb. 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1. Diringer, D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchinson, London, 1968.
10.2 DEVANĀGARĪ
1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966].
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha. (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar, 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10 (2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1. IS 10401: 8-bit code for information interchange. 1982.
2. IS 10315: 7-bit coded character set for information interchange. 1985.
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques. 1987.
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters. 2001.
5. ISO 2375: Procedure for registration of escape sequences. 2003.
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13. 1998-2001.
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21 Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same. However, in the given context, e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
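The all-or-nothing condition described above can be sketched in Python. The mapping below is a small, illustrative subset of single code point pairs copied from Table 19; the name `gurmukhi_variant` is hypothetical, and the sketch ignores multi-code-point sequences and code points with more than one Gurmukhi target, which a full implementation would need to handle.

```python
# Illustrative subset of Table 19 (Devanagari -> Gurmukhi), single code points only.
DEVA_TO_GURU = {
    "\u0917": "\u0A17",  # ग -> ਗ
    "\u091F": "\u0A1F",  # ट -> ਟ
    "\u0920": "\u0A20",  # ठ -> ਠ
    "\u092E": "\u0A38",  # म -> ਸ
    "\u093F": "\u0A3F",  # ि -> ਿ
    "\u0940": "\u0A40",  # ी -> ੀ
}


def gurmukhi_variant(label):
    """Return the Gurmukhi look-alike label, or None if any character has no counterpart."""
    out = []
    for ch in label:
        if ch not in DEVA_TO_GURU:
            # One unmapped character is enough to make the label visually distinct.
            return None
        out.append(DEVA_TO_GURU[ch])
    return "".join(out)
```

A label made only of mapped characters yields a whole-label variant; adding a single unmapped character such as क (U+0915) yields none.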
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 | Bengali (see note 17)
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
U+0945 | U+0981 | Bengali
Table 22 Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
Variant 1 | Variant 2
आ U+0906 | आ़ U+0906 U+093C
ओ U+0913 | ओ़ U+0913 U+093C
ा U+093E | U+093E U+093C
ो U+094B | U+094B U+093C
Table 16 Proposed Variants - Set 1
6.1.1 Variant context rule for Santali Nukta variants
All of the Nukta variants given in Table 16 (Proposed Variants - Set 1) share a typical characteristic: within a variant pair, Variant 1 is a subset of Variant 2. E.g. in the first pair, आ (U+0906) is a subset of आ़ (U+0906 U+093C). This implies a regenerative tendency in theory, i.e. if an आ (U+0906) is substituted with आ़ (U+0906 U+093C), it introduces a new instance of आ (U+0906), namely the first code point of आ़ (U+0906 U+093C). By definition, this new instance of आ (U+0906) may also need to be substituted with आ़ (U+0906 U+093C), thereby creating an invalid akshar combination (U+0906 U+093C U+093C) in which a Nukta would need to follow another Nukta. To prevent this, a variant context rule has been added to all the above Nukta variants, as given below.
Rule: As per Table 16 (Proposed Variants - Set 1), the Variant 1 to Variant 2 relationship exists if and only if the Variant 1 character is not followed by a Nukta (U+093C) character. Thus the following variant relations are bound by the above condition:
आ (U+0906) → आ़ (U+0906 U+093C)
ओ (U+0913) → ओ़ (U+0913 U+093C)
ा (U+093E) → (U+093E U+093C)
ो (U+094B) → (U+094B U+093C)
The variant relationship from Variant 2 to Variant 1 should be equally constrained, for two reasons. First, a variant is uniquely defined by both the variant mapping and the context condition imposed on it (see [RFC 7940]). In order to maintain a symmetric definition of variants, it is necessary to define both forward and symmetric variants using the same condition (see also [RFC 8228]). Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
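The "not followed by a Nukta" condition can be illustrated with a short Python sketch. The function names are hypothetical and the code is only an approximation of the idea: it applies the forward mapping at every eligible position at once, whereas an LGR tool would enumerate the individual variant labels.

```python
NUKTA = "\u093C"
# Variant 1 code points from Table 16.
NUKTA_VARIANT_SOURCES = {"\u0906", "\u0913", "\u093E", "\u094B"}


def nukta_variant_positions(label):
    """Indices where the Variant 1 -> Variant 2 mapping is defined (no Nukta follows)."""
    return [
        i for i, ch in enumerate(label)
        if ch in NUKTA_VARIANT_SOURCES and label[i + 1:i + 2] != NUKTA
    ]


def add_nukta_variants(label):
    """Append a Nukta after every eligible code point, never producing a double Nukta."""
    eligible = set(nukta_variant_positions(label))
    out = []
    for i, ch in enumerate(label):
        out.append(ch)
        if i in eligible:
            out.append(NUKTA)
    return "".join(out)
```

Because the check looks at the original label, a code point that already carries a Nukta is skipped, which is exactly what keeps the regenerative substitution described above from producing U+093C U+093C.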
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902
Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | | ↔ blocked | not followed-by-N
When substituting the variant from the overlapping sets for 0906 or 093E respectively into the left-hand-side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence/absence of a Nukta is the basis for several variant sets, these variants are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A
0906 093C 0901 is added to Set B
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D: 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902, 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
Additional Notes:
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another overlapped variant set with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, they are all invalid, as a Matra can only follow C or N while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | U+0905 U+0902
U+093A | U+0902
ॴ U+0974 | U+0906 U+0902
ऻ U+093B | U+093E U+0902
ऎ U+090E | ऐ U+0910
U+0946 | U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17 Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with matching code point contexts, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion, not proposed as variants)
Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence/absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
38
64 VariantsbasedonCandrabinduandCandraVowelSignsfollowedbyAnusvara
This isacaseofpairsofsimilar lookingvariants involvingaCandrabindu(U+0901) InDevanagari there are twoCandravowel signs viz (U+0945) and ॉ (U+0949)whichwhen succeeded by an Anusvara (U+0902) create a shape which resembles aCandrabindu(U+0901)ThisgivesrisetopairswhichgetrenderedexactlyalikeinmanyfontsThoughsome fontscanrender themdifferently thebehavior isnotconsistentandkeepsroomforambiguityTheactualpairsofthevariantsareasfollows
Variant1 Variant2
U+0901
U+0945U+0902
ा
U+093EU+0901
ॉ
U+0949U+0902
अ
U+0905U+0901
U+0972U+0902
ए
U+090FU+0901
ऍ
U+090DU+0902
आ
U+0906U+0901
ऑ
U+0911U+0902 Table 18 Proposed Variants - Set 3
Asthefirsttwosuggestedpairsarecomposedonlyofthedependentsignstheydonotgetproperly joined in the absence of an independent character like Consonant or a Vowel
beforethesameForthesakeofclarityboththepairsprecededbyaDevanagariLetterKa(क - U+0915)areshownhere
VariantPair1क(U+0901)andक (U+0945U+0902)
VariantPair2का (U+0949U+0902)andकॉ (U+0949U+0902)
IdeallythecaseofU+0945U+0902shouldberenderedas and the case of (U+0949
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
39
U+0902)shouldberenderedas where theAnusvara is clearlyseparated fromtheCandrashapeHowevernotallfontsfollowthesameconventionandmanyofthemrender
thetwoshapesexactly likeaCandrabinduThisgivesrise toanambiguityandhencethevariantpossibility
641 VariantcontextruleforCandrabinduandCandra-Anusvaravariantpair
Thevariantpair(U+0901)-(U+0945U+0902)necessitatesthatforcreatingavariantlabeleveryCandrabindu(U+0901)inalabelbereplacedbythesequence (U+0945U+0902)ItshouldbenotedthatthebeginningofthesaidsequenceiswithaMatrasignie(U+0945)WhiletheCandrabindu(U+0901)canbeprecededbyeitherofConsonantsVowelsNuktaoraMatrathesameisnottrueforthe(U+0945)whichisaMatraAMatracanonlybeprecededbyaconsonantoraconsonantfollowedbyaNuktaTohandlethisavariantcontextrulehasbeenaddedto (U+0945U+0902)asgivenbelow
RuleAspertheTable 18 Proposed Variants - Set 3thevariantrelationshipbetweenthepair
(U+0901) - (U+0945 U+0902) exists if and only if Candrabindu (U+0901) is
precededbyaConsonantoraConsonantfollowedbyaNukta
There isnoadditionalconstraintontheprecedingcharacterof (U+0945U+0902)asthe Candrabindu can be preceded by any characterwhich can precede the sequence (U+0945 U+0902) However to express the mapping the sequence U+0945 U+0902 isformally part of the repertoire and as such must have a context rule that matches thecontext rule for U+0945 which happens to be the same context rule as for the variant
mapping For the same symmetry reason as discussed in Section611 above the formaldefinition of the variant mapping (U+0945 U+0902) -gt (U+0901) has the same variantcontextconditiontothatforthemapping(U+0901)-gt(U+0945U+0902)
65 VariantDisposition
AsvariantsmentionedinTable16Table17andTable18areconfusinglysimilaralbeitof
apeculiarnatureitisproposedthattheybeconsideredofblockednature
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
40
There is nopreference among these variantsWhichever label containing either of these
variantsischosenearliertheotherequivalentvariantlabelshouldbeblocked
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
U+0902 | U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
U+093A | U+0A02
U+093C | U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
U+0945 | U+0A71
U+0946 | U+0A47
U+0946 | U+0A4B
U+0947 | U+0A47
U+0947 | U+0A4B
U+0948 | U+0A48
U+0956 | U+0A41
U+0957 | U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants

Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is 0931 (ऱ - DEVANAGARI LETTER RRA), H is 094D (DEVANAGARI SIGN VIRAMA), and C3 is either 092F (य - DEVANAGARI LETTER YA) or 0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN (see footnote 15)
3. M must be preceded by C or CN (see footnote 16)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
Footnote 15: where CN is a C followed by an N.
Footnote 16: where CN is a C followed by an N.
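The seven rules above can be sketched as a small Python checker. This is a minimal sketch under stated assumptions, not the NBGP implementation: the C, V and M sets are approximated from Unicode block ranges (the normative categories are those of Table 6), the sets C1, V1 and M1 are copied from rule 1, and `first_violation` is a hypothetical name. It reports the number of the first rule a label breaks; as the document notes elsewhere, which rule a given tool flags may depend on its internal implementation.

```python
# Assumed category sets (approximations of Table 6, not the normative repertoire).
C = set(range(0x0915, 0x093A))
V = set(range(0x0905, 0x0915)) | set(range(0x0972, 0x0976))
M = {0x093A, 0x093B} | set(range(0x093E, 0x094D)) | {0x094F, 0x0956, 0x0957}
B, D, X, H, N = 0x0902, 0x0901, 0x0903, 0x094D, 0x093C

# Sets from rule 1.
C1 = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C, 0x0921, 0x0922, 0x092B}
V1 = {0x0906, 0x0913}
M1 = {0x093E, 0x094B}


def first_violation(label):
    """Return the number (1-7) of the first WLE rule violated, or None."""
    cps = [ord(ch) for ch in label]
    for i, cp in enumerate(cps):
        prev = cps[i - 1] if i > 0 else None
        prev2 = cps[i - 2] if i > 1 else None
        after_c_or_cn = prev in C or (prev == N and prev2 in C)
        if cp == N and prev not in (C1 | V1 | M1):
            return 1
        if cp == H and not after_c_or_cn:
            return 2
        if cp in M and not after_c_or_cn:
            return 3
        if cp in (X, B, D):
            if not (prev in V or prev in C or prev == N or prev in M):
                return {X: 4, B: 5, D: 6}[cp]
        if cp in V and prev == H:
            return 7
    return None
```

For instance, the sequence U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930 discussed under "Case of V preceded by H" is rejected by rule 7, while the same sequence without U+094D passes.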
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
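To make the prohibited pattern concrete, the following hedged Python sketch locates every Vowel that directly follows a Halant in a label. The vowel range is an assumed approximation and the helper name is hypothetical; the two sample labels are built from the code point sequence quoted in the example above, with and without U+094D.

```python
HALANT = "\u094d"
# Assumption: independent vowels approximated by U+0905..U+0914.
INDEPENDENT_VOWELS = {chr(cp) for cp in range(0x0905, 0x0915)}


def halant_vowel_positions(label):
    """Indices of Vowels that directly follow a Halant (prohibited by rule 7)."""
    return [
        i
        for i in range(1, len(label))
        if label[i - 1] == HALANT and label[i] in INDEPENDENT_VOWELS
    ]


# The "Mango pickle" example, with the explicit Halant after U+092E.
WITH_HALANT = "".join(
    map(chr, (0x0906, 0x092E, 0x094D, 0x0905, 0x091A, 0x093E, 0x0930))
)
# The same label without the Halant, which the rules accept.
WITHOUT_HALANT = WITH_HALANT.replace(HALANT, "")
```

Only the form with the explicit Halant contains a prohibited position; the two forms otherwise differ by a single code point, which is the source of the perceived similarity.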
8 Contributors
NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DN Domains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia Foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and .np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi

In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1. Diringer, D. The Alphabet: A Key to the History of Mankind. 3rd Edition, in 2 Volumes. Hutchinson, London, 1968.
10.2 DEVANĀGARĪ
1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1. IS 10401: 8-bit code for information interchange, 1982.
2. IS 10315: 7-bit coded character set for information interchange, 1985.
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987.
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001.
5. ISO 2375: Procedure for registration of escape sequences, 2003.
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001.
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters and character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in the given context (e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing), they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
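The all-or-nothing condition in the preceding paragraph can be sketched in Python. This is an illustration under stated assumptions: the mapping is a small excerpt of the Devanagari-Gurmukhi pairs in Table 19 (not the full table), the function name is hypothetical, and sequences are matched greedily, longest first.

```python
# Excerpt of Table 19 (Devanagari -> Gurmukhi); not the complete table.
DEVA_TO_GURU = {
    "\u0917": "\u0a17",
    "\u091f": "\u0a1f",
    "\u0920": "\u0a20",
    "\u092e": "\u0a38",
    "\u0935": "\u0a15",
    "\u093f": "\u0a3f",
    "\u0940": "\u0a40",
    "\u0924\u094d\u0924": "\u0a1c",  # the three-code-point sequence from Table 19
}
MAX_LEN = max(len(key) for key in DEVA_TO_GURU)


def gurmukhi_variant(label):
    """Return the Gurmukhi look-alike label, or None if any part has no counterpart."""
    out = []
    i = 0
    while i < len(label):
        for n in range(MAX_LEN, 0, -1):  # try the longest sequence first
            chunk = label[i:i + n]
            if chunk in DEVA_TO_GURU:
                out.append(DEVA_TO_GURU[chunk])
                i += len(chunk)
                break
        else:
            # One unmatched character is enough to make the label distinguishable.
            return None
    return "".join(out)
```

A single code point without a counterpart makes the whole label distinguishable, so the function returns None rather than a partially mapped string.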
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 (see footnote 17) | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
U+0945 | U+0981 | Bengali
Table 22: Devanagari Cross-script confusables
Footnote 17: The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
definition of variants, it is necessary to define both forward and symmetric variants using the same condition. (See also [RFC8228].) Second, this type of variant pair is an "effective null variant", where the U+093C in one sequence maps to "nothing" in the other. In order to maintain a fully transitive system of variant definitions, it is necessary to prevent a label like U+0906 U+093C U+093C from having a variant U+0906 U+093C. The same condition would ensure this second constraint. However, as sequences of U+093C followed by U+093C are already invalid due to context rules on the Nukta, the condition is only required for the first reason in this case.
6.1.2 Overlapped variant analysis involving Nukta
Consider the following variant sets A, B, C and D. Each of them contains 0906 or 093E.
Set | Mapping
Variant Set A | 093E 0901 <--> 0949 0902
Variant Set B | 0906 0901 <--> 0911 0902
Variant Set C | 0906 0902 <--> 0974
Variant Set D | 093B <--> 093E 0902

Overlapping variant sets involving 0906 and 093E plus Nukta:
Source | Glyph | Target | Glyph | Type | Variant Context
0906 | आ | 0906 093C | आ़ | ↔ blocked | not followed-by-N
093E | ा | 093E 093C | ा़ | ↔ blocked | not followed-by-N

When substituting the variant from the overlapping sets for 0906 or 093E respectively into the left-hand-side sequences of Variant Sets A, B, C and D, the result is a valid sequence that displays with a Nukta. As the presence or absence of a Nukta is the basis for several variant sets, these variant sets are extended to ensure symmetry and transitivity:
093E 093C 0901 is added to Set A,
0906 093C 0901 is added to Set B,
0906 093C 0902 is added to Set C, and
093E 093C 0902 is added to Set D.
For Set C, the implicit code point contexts of 0902 and 0974 are not equal: 0902 can be followed by Vowels and Consonants only, but 0974 can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive the context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for Set D, 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 and 0903. (Both can be at the end of the label.) Therefore, the variant mapping should receive the context rule when(followed-by-V-C-or-end).
The resulting variant sets and variant contextual rules are:
Set | Mapping | Variant Contextual Rule
Variant Set A | 093E 0901 <--> 093E 093C 0901 <--> 0949 0902 | -
Variant Set B | 0906 0901 <--> 0906 093C 0901 <--> 0911 0902 | -
Variant Set C | 0906 0902 <--> 0906 093C 0902 <--> 0974 | when(followed-by-V-C-or-end)
Variant Set D | 093B <--> 093E 0902 <--> 093E 093C 0902 | when(followed-by-V-C-or-end)
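The effect of the when(followed-by-V-C-or-end) rule on Variant Set C can be sketched as follows. This is a minimal, non-normative Python sketch: the vowel and consonant ranges are assumed approximations of the repertoire, the function names are hypothetical, and only single-substitution variants of Set C are generated.

```python
# Assumed approximations of the Vowel and Consonant categories.
VOWELS = set(range(0x0905, 0x0915)) | set(range(0x0972, 0x0976))
CONSONANTS = set(range(0x0915, 0x093A))

# Members of Variant Set C after the Nukta extension.
SET_C = ["\u0906\u0902", "\u0906\u093c\u0902", "\u0974"]


def followed_by_v_c_or_end(label, end):
    """Context rule: the match ends the label or is followed by a Vowel or Consonant."""
    return end == len(label) or ord(label[end]) in (VOWELS | CONSONANTS)


def set_c_variants(label):
    """Labels obtained by swapping one Set C member where the context rule holds."""
    results = set()
    for src in SET_C:
        start = label.find(src)
        while start != -1:
            end = start + len(src)
            if followed_by_v_c_or_end(label, end):
                for tgt in SET_C:
                    if tgt != src:
                        results.add(label[:start] + tgt + label[end:])
            start = label.find(src, start + 1)
    return results
```

When 0974 is followed by 0901, the rule does not hold and no variant is produced, which avoids generating 0906 0902 0901, a sequence that the implicit context of 0902 does not permit.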
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences lead to a variant where 0901 is expanded to 0945 0902.
2. Another variant set overlapped with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, the resulting sequences are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as the second element have V or M as the first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels/Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel/Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.
Variant 1 | Variant 2
ॳ U+0973 | अं U+0905 U+0902
U+093A | U+0902
ॴ U+0974 | आं U+0906 U+0902
ऻ U+093B | ां U+093E U+0902
ऎ U+090E | ऐ U+0910
U+0946 | U+0947
ॵ U+0975 | औ U+0914
ॏ U+094F | ौ U+094C
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion; not proposed as variants)
Another case of deceptive similarity to a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence or absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases, the presence or absence of the final Halant is clearly visible and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. (U+0945) and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of the variants are as follows:
Variant 1 | Variant 2
U+0901 | U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, like a Consonant or a Vowel, before them. For the sake of clarity, both the pairs preceded by a Devanagari Letter Ka (क - U+0915) are shown here:
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, the sequences (U+0945 U+0902) and (U+0949 U+0902) should be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. (U+0945). While the Candrabindu (U+0901) can be preceded by either of Consonants, Vowels, Nukta or a Matra, the same is not true for (U+0945), which is a Matra. A Matra can only be preceded by a Consonant or a Consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There isnoadditionalconstraintontheprecedingcharacterof (U+0945U+0902)asthe Candrabindu can be preceded by any characterwhich can precede the sequence (U+0945 U+0902) However to express the mapping the sequence U+0945 U+0902 isformally part of the repertoire and as such must have a context rule that matches thecontext rule for U+0945 which happens to be the same context rule as for the variant
mapping For the same symmetry reason as discussed in Section611 above the formaldefinition of the variant mapping (U+0945 U+0902) -gt (U+0901) has the same variantcontextconditiontothatforthemapping(U+0901)-gt(U+0945U+0902)
65 VariantDisposition
AsvariantsmentionedinTable16Table17andTable18areconfusinglysimilaralbeitof
apeculiarnatureitisproposedthattheybeconsideredofblockednature
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
40
There is nopreference among these variantsWhichever label containing either of these
variantsischosenearliertheotherequivalentvariantlabelshouldbeblocked
66 Cross-scriptVariants
Across-scriptvariantalsosometimesreferredtoasWholeLabelvariant isthevariant
casewhereonelabelinonescriptcanbecomposedinsuchawaythatitresemblesanother
entirelabelinadifferentscript
Every individualLGRunderNBGP is supposed toprovidea setof crossscriptvariants it
identifieswithallotherscriptsunderNBGP
NBGP has ensured that not only the individual characters but also most of the akshar
variations are taken into consideration during the Cross-script variant analysis ofDevanagariwithall theother scriptsunderNBGPThiswasachievedby sharinga listof
most of the Devanagari akshar combinationswith all the other script teams (Thewordlsquomostrsquo is used here as it is not practical to cover all the possible ldquoConsonant +Halant +Consonant + helliprdquo cases However for Devanagari all cases of ldquoConsonant + Halant +
Consonantrdquocombinationswereincludedintheanalysis)
The Devanagari script has a major set of possible cross-script variants only with the
GurmukhiscriptCaseslistedinTable19areofthevariantsthatareproposedtobecross-
script variants between Devanagari and Gurmukhi Similarly Table 20 has the casesproposedtobecross-scriptvariantsbetweenDevanagariandBengali
ItistobenotedthatnoneofthecombinationslistedinTable19andTable20aretermedto
be equivalentsof eachother semanticallyorotherwiseTheyareonly groupedbasedonpossiblevisualconfusability
NBGPhasensuredthatDevanagariBengaliandGurmukhiLGRteamsproposeasameset
ofcross-scriptvariantsbymeetingface-to-faceonmanyoccasionsaswellasthroughmailcommunicationsThesamesetofcross-scriptvariants(withDevanagari)issupposedtobefoundintheBengaliandGurmukhiLGRdocuments
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
41
Devanagari Gurmukhi
U+0902
U+0A02
इU+0907
ਙU+0A19
उU+0909
ਤU+0A24
ग
U+0917
ਗU+0A17
घU+0918
ਬU+0A2C
टU+091F
ਟU+0A1F
ठU+0920
ਠU+0A20
ढU+0922
ਫU+0A2B
प
U+092A
ਧU+0A27
भU+092D
ਮU+0A2E
मU+092E
ਸU+0A38
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
42
वU+0935
ਕU+0A15
हU+0939
ਵU+0A35
U+093A
U+0A02
U+093C
U+0A3C
िU+093F
ਿU+0A3F
ीU+0940
ੀU+0A40
U+0945
U+0A71
U+0946
U+0A47
U+0946
U+0A4B
U+0947
U+0A47
U+0947
U+0A4B
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
43
U+0948
U+0A48
U+0956
U+0A41
U+0957
U+0A42
ि7ट
U+092AU+094DU+091FU+093F
ਇU+0A07
7ट8U+092AU+094DU+091FU+0940
ਈU+0A08
7टU+092AU+094DU+091FU+0947
ਏU+0A0F
7ट
092A094D091F0946 ਏ
U+0A0F
9U+0924U+094DU+0924
ਜU+0A1C
Table 19 Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari Bengali
मU+092E
মU+09AE
िU+093F
িU+09BF
Table 20 Proposed Cross-script Devanagari-Bengali Variants
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
44
In addition to above cases the Devanagari and Gurmukhi scripts have a possible set ofcross-scriptconfusableswhichlooksimilarbutnotsimilarenoughtoberecommendedas
cross-scriptvariantsTheTable22DevanagariCross-scriptconfusablesinAppendixBCross-scriptConfusablesliststhem
7 WholeLabelEvaluationRules(WLE)ThissectionprovidestheWLEsthatarerequiredbyallthelanguagesmentionedinSection
32whenwritteninDevanagariScriptTheruleshavebeendraftedinsuchawaythattheycanbeeasilytranslatedintotheLGRspecification
BelowarethesymbolsusedintheWLErules foreachoftheCategoryasmentionedin
theTable6Codepointrepertoire
C rarr Consonant
M rarr Matra
V rarr Vowel
B rarr Anusvara(Bindu)
D rarr Candrabindu
X rarr Visarga
H rarr HalantVirama
N rarr Nukta
S rarr EyelashReph(C2HC3)whereC2is0931(ऱ-DEVANAGARILETTERRRA)His094D( - DEVANAGARISIGNVIRAMA)C3iseither-092F(य - DEVANAGARILETTERYA)or0939(ह - DEVANAGARILETTERHA)
BelowarethespecificWLErules
1 NmustbeprecededonlybyamemberofC1V1orM1
ThesetC1consistsoftheseconsonants
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
45
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
ThesetV1consistsofthesevowels
a आ (U+0906)(RequiredinSantalilanguage)
b ओ (U+0913)(RequiredinSantalilanguage)
ThesetM1consistsofthesematras
a ा (U+093E)(RequiredinSantalilanguage)
b ो (U+094B)(RequiredinSantalilanguage)
2 HmustbeprecededbyCorCN15
3 MmustbeprecededbyCorCN16
4 XmustbeprecededbyeitherofVCNorM
5 BmustbeprecededbyeitherofVCNorM
6 DmustbeprecededbyeitherofVCNorM
7 VCanNOTbeprecededbyH(detailsinCaseofVprecededbyH)
15 where CN is a C followed by an N 16 where CN is a C followed by an N
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
46
Additional rules are used only for variants where a Nuktamaps to a null or that areoverlapped
bull VariantisnotdefinediffollowedbyaNukta(see641)
bull VariantundefinedifitisnotfollowedbyVorC(includingRRA)orendoflabel(See612)
CaseofEyelashReph
IntheWLErulesthereisnospecificmentionoftheEyelashRephfortworeasons1 AstheU+0931isaddedasapartofpermissiblesequencesinTable7Sequencesit
getspermittedonlywiththespecificsequences
2 The last characters of both the sequences of which the U+0931 is part are
consonants As the Eyelash-Reph can take all the combinations as that of a
consonantnospecifichandlingintermsofcontextruleisrequired
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any valid akshar-ending character. Only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
9 References
[MSR]IntegrationPanelMaximalStartingRepertoiremdashMSR-4OverviewandRationale7February2019httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf(Accessedon18thFeb2019)
[EGIDS]ExpandedGradedIntergenerationalDisruptionScalehttpswwwethnologuecomaboutlanguage-status(Accessedon13thNov2017)
[NBGP]Neo-BrahmiGenerationPanel
[RFC7940]DaviesKandAFreytagRepresentingLabelGenerationRulesetsusingXMLRFC7940August2016httpstoolsietforghtmlrfc7940(Accessedon1stDec2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD]genericTopLevelDomain
[ISCII]IndianScriptCodeforInformationInterchangehttpscdacinindexaspxid=mlc_gist_iscii(Accessedon2ndFeb2018)
[GIST]GraphicsIntelligencebasedScriptTechnologieshttpscdacinindexaspxid=gist(Accessedon2ndFeb2018)
[C-DAC]CentreforDevelopmentofAdvancedComputinghttpscdacin(Accessedon2ndFeb2018)
[0]TheUnicodeStandard11httpwwwunicodeorgversionsUnicode110(Accessedon12thDec2017)
[8]TheUnicodeStandard50httpwwwunicodeorgversionsUnicode500(Accessedon12thDec2017)
[9]TheUnicodeStandard51httpwwwunicodeorgversionsUnicode510(Accessedon12thDec2017)
[11]TheUnicodeStandard60httpwwwunicodeorgversionsUnicode600(Accessedon12thDec2017)
[100]DevanāgarīVIPTeamldquoVariantIssuesReportrdquoICANN3rdOct2011httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf(Accessedon10thOct2017)
[101]OmniglotHindihttpswwwomniglotcomwritinghindihtm(Accessedon10thOct2017)
[102]OmniglotMarathihttpswwwomniglotcomwritingmarathihtm(Accessedon10thOct2017)
[103]OmniglotSanskrithttpswwwomniglotcomwritingsanskrithtm(Accessedon10thOct2017)
[104]OmniglotSindhihttpswwwomniglotcomwritingsindhihtm(Accessedon10thOct2017)
[105]OmniglotKashmirihttpswwwomniglotcomwritingkashmirihtm(Accessedon10thOct2017)
[106]Unicode1000SouthandCentralAsia-I-OfficialScriptsofIndiardquoPage456(R5andR5a)httpwwwunicodeorgversionsUnicode1000ch12pdf(Accessedon13thNov2017)
[107]UnicodeIndicGroupDevanagariEyelashRahttpunicodeorg~emulleriwgp8utcdochtml(Accessedon13thNov2017)
[108]MKRainaHowtoreadandwriteKashmiriinDevanagarihttpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf(Accessedon12thDec2017)
[109]CentralHindiDirectorate-MinistryofHRD-GovtofIndiaDevanāgarīAlphabetanditsRomanizationhttphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml(Accessedon12thDec2017
[110]OmniglotBodohttpswwwomniglotcomwritingbodohtm(Accessedon12thDec2017)
[111]OmniglotMaithilihttpswwwomniglotcomwritingmaithilihtm(Accessedon12thDec2017)
[112]OmniglotKonkanihttpswwwomniglotcomwritingkonkanihtm(Accessedon20thMay2018)
[113]OmniglotNepalihttpswwwomniglotcomwritingnepalihtm(Accessedon20thMay2018)
[114] NBGP Public comment feedback for Devanagari Gujarati Gurmukhi Script LGRProposalshttpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw(Accessedon18thFeb2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
11 KachruYamuna1966AnIntroductiontoHindiSyntaxUrbanaUniversityof
IllinoisDepartmentofLinguistics
12 KalyanKaleandAnjaliSoman1986LearningMarathiShriVishakhaPrakashan
Pune
13 McGregorRS(1977)OutlineofHindiGrammar2ndedDelhiOxfordUniversity
Press
14 McGregorRS1972OutlineofHindiGrammarwithExercisesDelhiOxford
UniversityPress
15 McGregorRS1974HindiLiteratureoftheNineteenthandEarlyTwentieth
CenturiesWiesbadenHarrassowitz
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters / character sequences which may appear visually confusing to some of the users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1      Confusable 2
क  U+0915         क़  U+0915 U+093C
ख  U+0916         ख़  U+0916 U+093C
ग  U+0917         ग़  U+0917 U+093C
च  U+091A         च़  U+091A U+093C
छ  U+091B         छ़  U+091B U+093C
ज  U+091C         ज़  U+091C U+093C
ड  U+0921         ड़  U+0921 U+093C
ढ  U+0922         ढ़  U+0922 U+093C
फ  U+092B         फ़  U+092B U+093C
Table 21: Visually confusables
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in a given context (e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing), they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
Devanagari confusable    Other script confusable    From script
ः  U+0903                ઃ  U+0A83                  Gujarati
ः  U+0903                ః  U+0C03                  Telugu
ः  U+0903                ಃ  U+0C83                  Kannada
ः  U+0903                ഃ  U+0D03                  Malayalam
ः  U+0903                ඃ  U+0D83                  Sinhala
ः  U+0903                ঃ  U+0983 (17)             Bengali
उ  U+0909                ও  U+0993                  Bengali
घ  U+0918                ঘ  U+0998                  Bengali
ठ  U+0920                ਨ  U+0A28                  Gurmukhi
ठ  U+0920                ਰ  U+0A30                  Gurmukhi
ड  U+0921                ਡ  U+0A21                  Gurmukhi
ड  U+0921                ਤ  U+0A24                  Gurmukhi
ढ  U+0922                ਢ  U+0A22                  Gurmukhi
त  U+0924                ਜ  U+0A1C                  Gurmukhi
य  U+092F                ਧ  U+0A27                  Gurmukhi
   U+0945                   U+0981                  Bengali
Table 22: Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
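The "all constituent characters" condition described in this appendix can be illustrated with a small sketch over the Devanagari-Gurmukhi rows of Table 22. It is an assumption-laden simplification: it works per code point rather than per akshara, it covers only the Gurmukhi rows, and the names `DEVA_TO_GURU`, `is_fully_confusable` and `confusable_labels` are hypothetical.

```python
from itertools import product

# Devanagari -> Gurmukhi rows of Table 22 (informative confusables only,
# not variants). Per-code-point mapping; aksharas are not modelled here.
DEVA_TO_GURU = {
    "\u0920": ["\u0A28", "\u0A30"],
    "\u0921": ["\u0A21", "\u0A24"],
    "\u0922": ["\u0A22"],
    "\u0924": ["\u0A1C"],
    "\u092F": ["\u0A27"],
}


def is_fully_confusable(label, table=DEVA_TO_GURU):
    """True only if EVERY character of the label has a counterpart."""
    return bool(label) and all(ch in table for ch in label)


def confusable_labels(label, table=DEVA_TO_GURU):
    """List every other-script string that could be confused with label."""
    if not is_fully_confusable(label, table):
        return []  # one distinguishing character is enough to break the match
    return ["".join(combo) for combo in product(*(table[ch] for ch in label))]
```

A label such as U+0920 U+0921 has four Gurmukhi counterparts under this table, while U+0920 U+0915 has none, because U+0915 has no entry and so provides the visual distinction.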
the end of the label). Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end). This matches the intersection between these contexts.
Likewise for set D: 0902 can be followed by Vowels and Consonants only, but 093B can also be followed by 0901, 0902 or 0903. (Both can be at the end of the label.) Therefore the variant mapping should receive a context rule when(followed-by-V-C-or-end).
The conclusions on these variant sets and variant contextual rules are:

Set             Mapping                                         Variant Contextual Rule
Variant Set A   093E 0901 <--> 093E 093C 0901 <--> 0949 0902    -
Variant Set B   0906 0901 <--> 0906 093C 0901 <--> 0911 0902    -
Variant Set C   0906 0902 <--> 0906 093C 0902 <--> 0974         when(followed-by-V-C-or-end)
Variant Set D   093B <--> 093E 0902 <--> 093E 093C 0902         when(followed-by-V-C-or-end)
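As a rough illustration of the when(followed-by-V-C-or-end) condition, the sketch below checks, for each occurrence of U+093B (the single code point member of Set D), whether the variant mapping would be defined at that position. The vowel and consonant ranges are assumptions made for the example, and the function names are hypothetical; the normative rule is the one in the accompanying XML.

```python
def is_vowel(cp):
    # Assumed independent-vowel ranges, for illustration only.
    return 0x0904 <= cp <= 0x0914 or 0x0972 <= cp <= 0x0977


def is_consonant(cp):
    # Assumed consonant range; it includes U+0931 (RRA).
    return 0x0915 <= cp <= 0x0939


def followed_by_v_c_or_end(cps, start, length):
    """Look ahead past cps[start:start+length]: V, C or end of label."""
    end = start + length
    if end == len(cps):
        return True
    return is_vowel(cps[end]) or is_consonant(cps[end])


def set_d_variant_positions(label):
    """Indices of U+093B at which the Set D mapping would be defined."""
    cps = [ord(ch) for ch in label]
    return [i for i, cp in enumerate(cps)
            if cp == 0x093B and followed_by_v_c_or_end(cps, i, 1)]
```

With this check, U+093B at the end of a label or before a consonant yields a variant position, whereas U+093B followed by U+0901 does not, because the other members of the set could not take that following sign.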
Additional Notes
1. Overlapped variants involving Candrabindu: 0901 <--> 0945 0902 is technically overlapped with the four sets above. However, because of the variant context rule "when(follows-only-C-or-N)" (see 6.4.1), none of the sequences leads to a variant where 0901 is expanded to 0945 0902.
2. Another overlapped variant set with these four sets is 0902 (B, Anusvara) <--> 093A (M, Matra). However, they are all invalid, as a Matra can only follow C or N, while all sequences including 0902 as second element have V or M as first element.
6.2 Unique Vowels and Vowel Signs required for Kashmiri
Kashmiri, when written in the Devanagari script, requires a unique set of Vowels and Vowel signs which only a Kashmiri speaker can understand. The majority of Devanagari users who are not conversant with Kashmiri can easily confuse them with some of the Vowels / Vowel signs which look similar to the Kashmiri ones. There are also cases where a Kashmiri Vowel / Vowel sign can be confused with certain akshar formations. Hence they are being proposed as variants.

Variant 1        Variant 2
ॳ  U+0973        अं  U+0905 U+0902
   U+093A           U+0902
ॴ  U+0974        आं  U+0906 U+0902
ऻ  U+093B        ां  U+093E U+0902
ऎ  U+090E        ऐ  U+0910
   U+0946           U+0947
ॵ  U+0975        औ  U+0914
ॏ  U+094F        ौ  U+094C
Table 17: Proposed Variants - Set 2

No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both source and target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (only a discussion, not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the function of the Halant is that of a vowel killer, when it comes at the end many users tend to ignore the phonetic effect of its presence or absence. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is being accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case to make them variant pairs. Eventually, in the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which when succeeded by an Anusvara (U+0902) create a shape which resembles a Candrabindu (U+0901). This gives rise to pairs which get rendered exactly alike in many fonts. Though some fonts can render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:

Variant 1              Variant 2
   U+0901                 U+0945 U+0902
ाँ  U+093E U+0901      ॉं  U+0949 U+0902
अँ  U+0905 U+0901      ॲं  U+0972 U+0902
एँ  U+090F U+0901      ऍं  U+090D U+0902
आँ  U+0906 U+0901      ऑं  U+0911 U+0902
Table 18: Proposed Variants - Set 3

As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character such as a Consonant or a Vowel before them. For the sake of clarity, both pairs preceded by the Devanagari Letter Ka (क - U+0915) are shown here:
Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow the same convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence the variant possibility.
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair (U+0901) - (U+0945 U+0902) necessitates that, for creating a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). It should be noted that the said sequence begins with a Matra sign, i.e. U+0945. While the Candrabindu (U+0901) can be preceded by a Consonant, Vowel, Nukta or Matra, the same is not true for U+0945, which is a Matra. A Matra can only be preceded by a consonant or a consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902) as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or a Consonant followed by a Nukta.
There is no additional constraint on the preceding character of (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
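The context rule above can be pictured with a short sketch that rewrites U+0901 to U+0945 U+0902 only where the Candrabindu follows a Consonant or a Consonant plus Nukta. It covers this one mapping only (not the other pairs of Table 18), the consonant range is an assumption for illustration, and the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093C"


def is_consonant(ch):
    # Assumed consonant range, for illustration only.
    return "\u0915" <= ch <= "\u0939"


def preceded_by_c_or_cn(label, i):
    """True if label[i] follows a Consonant or a Consonant + Nukta."""
    if i >= 1 and is_consonant(label[i - 1]):
        return True
    return i >= 2 and label[i - 1] == NUKTA and is_consonant(label[i - 2])


def candrabindu_variant(label):
    """Replace each eligible U+0901 by U+0945 U+0902; leave others as is."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and preceded_by_c_or_cn(label, i):
            out.append(CANDRA_E_ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu that follows a Vowel is left untouched by this mapping, since replacing it would put the Matra U+0945 directly after a Vowel, which the Matra rule does not allow.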
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be considered of blocked nature.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
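The "blocked, first chosen wins" disposition can be sketched as a toy registry. This is purely illustrative and is not how any registry is known to implement it: the `Registry` class and `variants_of` helper are hypothetical, and only two single code point pairs from Table 17 (U+090E/U+0910 and U+0975/U+0914) are modelled, without any context rules.

```python
from itertools import product

# Two symmetric pairs taken from Table 17; everything else is omitted.
PAIRS = {"\u090E": "\u0910", "\u0910": "\u090E",
         "\u0975": "\u0914", "\u0914": "\u0975"}


def variants_of(label):
    """All labels obtained by swapping any subset of paired code points."""
    options = [(ch, PAIRS[ch]) if ch in PAIRS else (ch,) for ch in label]
    return {"".join(combo) for combo in product(*options)} - {label}


class Registry:
    """First applicant gets the label; its variant labels become blocked."""

    def __init__(self):
        self.allocated = set()
        self.blocked = set()

    def apply(self, label):
        if label in self.allocated:
            return "taken"
        if label in self.blocked:
            return "blocked"
        self.allocated.add(label)
        self.blocked |= variants_of(label)
        return "allocated"
```

Because the pairs are symmetric, it does not matter which member of a variant set is applied for first: the other members end up blocked either way, which matches the "no preference" statement above.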
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. Table 19 lists the cases that are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.
It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed to be equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari                            Gurmukhi
    U+0902                               U+0A02
इ   U+0907                            ਙ  U+0A19
उ   U+0909                            ਤ  U+0A24
ग   U+0917                            ਗ  U+0A17
घ   U+0918                            ਬ  U+0A2C
ट   U+091F                            ਟ  U+0A1F
ठ   U+0920                            ਠ  U+0A20
ढ   U+0922                            ਫ  U+0A2B
प   U+092A                            ਧ  U+0A27
भ   U+092D                            ਮ  U+0A2E
म   U+092E                            ਸ  U+0A38
व   U+0935                            ਕ  U+0A15
ह   U+0939                            ਵ  U+0A35
    U+093A                               U+0A02
    U+093C                               U+0A3C
ि   U+093F                            ਿ  U+0A3F
ी   U+0940                            ੀ  U+0A40
    U+0945                               U+0A71
    U+0946                               U+0A47
    U+0946                               U+0A4B
    U+0947                               U+0A47
    U+0947                               U+0A4B
    U+0948                               U+0A48
    U+0956                               U+0A41
    U+0957                               U+0A42
प्टि  U+092A U+094D U+091F U+093F      ਇ  U+0A07
प्टी  U+092A U+094D U+091F U+0940      ਈ  U+0A08
प्टे  U+092A U+094D U+091F U+0947      ਏ  U+0A0F
प्टॆ  U+092A U+094D U+091F U+0946      ਏ  U+0A0F
त्त   U+0924 U+094D U+0924             ਜ  U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants

Devanagari      Bengali
म  U+092E       ম  U+09AE
ि  U+093F       ি  U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is U+0931 (ऱ - DEVANAGARI LETTER RRA), H is U+094D (् - DEVANAGARI SIGN VIRAMA) and C3 is either U+092F (य - DEVANAGARI LETTER YA) or U+0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (required in the Santali language)
b. ओ (U+0913) (required in the Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (required in the Santali language)
b. ो (U+094B) (required in the Santali language)
2. H must be preceded by C or CN, where CN is a C followed by an N.
3. M must be preceded by C or CN, where CN is a C followed by an N.
4. X must be preceded by either of V, C, N or M.
5. B must be preceded by either of V, C, N or M.
6. D must be preceded by either of V, C, N or M.
7. V can NOT be preceded by H (details in "Case of V preceded by H").
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant is undefined if it is not followed by V or C (including RRA) or the end of the label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any valid akshar-ending character. Only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptual similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
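Since a Vowel directly after a Halant is prohibited, an applicant with a multi-word label of this kind would have to use the form without the Halant. The sketch below shows one way such a label could be normalized before submission; it is an illustration only, the vowel range is an assumption, and `drop_halant_before_vowel` is a hypothetical helper, not part of the LGR.

```python
HALANT = "\u094D"


def is_vowel(ch):
    # Assumed independent-vowel ranges, for illustration only.
    return "\u0904" <= ch <= "\u0914" or "\u0972" <= ch <= "\u0977"


def drop_halant_before_vowel(label):
    """Remove each Halant that is immediately followed by a Vowel."""
    out = []
    for i, ch in enumerate(label):
        nxt = label[i + 1] if i + 1 < len(label) else ""
        if ch == HALANT and nxt and is_vowel(nxt):
            continue  # H + V is not permitted by WLE rule 7
        out.append(ch)
    return "".join(out)
```

Applied to the "Mango pickle" example, this yields U+0906 U+092E U+0905 U+091A U+093E U+0930. A Halant inside a conjunct or in final position is left unchanged, because rule 7 does not concern those cases.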
8 Contributors
NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data
Following is the full list of NBGP members with their language expertise:
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
9 References
[MSR]IntegrationPanelMaximalStartingRepertoiremdashMSR-4OverviewandRationale7February2019httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf(Accessedon18thFeb2019)
[EGIDS]ExpandedGradedIntergenerationalDisruptionScalehttpswwwethnologuecomaboutlanguage-status(Accessedon13thNov2017)
[NBGP]Neo-BrahmiGenerationPanel
[RFC7940]DaviesKandAFreytagRepresentingLabelGenerationRulesetsusingXMLRFC7940August2016httpstoolsietforghtmlrfc7940(Accessedon1stDec2017)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
51
[RCF8228]AFreytagldquoGuidanceonDesigningLabelGenerationRulesets(LGRs)SupportingVariantLabelsAugust2017httpstoolsietforghtmlrfc8228(Accessedon1stDec2017)
[gTLD]genericTopLevelDomain
[ISCII]IndianScriptCodeforInformationInterchangehttpscdacinindexaspxid=mlc_gist_iscii(Accessedon2ndFeb2018)
[GIST]GraphicsIntelligencebasedScriptTechnologieshttpscdacinindexaspxid=gist(Accessedon2ndFeb2018)
[C-DAC]CentreforDevelopmentofAdvancedComputinghttpscdacin(Accessedon2ndFeb2018)
[0]TheUnicodeStandard11httpwwwunicodeorgversionsUnicode110(Accessedon12thDec2017)
[8]TheUnicodeStandard50httpwwwunicodeorgversionsUnicode500(Accessedon12thDec2017)
[9]TheUnicodeStandard51httpwwwunicodeorgversionsUnicode510(Accessedon12thDec2017)
[11]TheUnicodeStandard60httpwwwunicodeorgversionsUnicode600(Accessedon12thDec2017)
[100]DevanāgarīVIPTeamldquoVariantIssuesReportrdquoICANN3rdOct2011httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf(Accessedon10thOct2017)
[101]OmniglotHindihttpswwwomniglotcomwritinghindihtm(Accessedon10thOct2017)
[102]OmniglotMarathihttpswwwomniglotcomwritingmarathihtm(Accessedon10thOct2017)
[103]OmniglotSanskrithttpswwwomniglotcomwritingsanskrithtm(Accessedon10thOct2017)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
52
[104]OmniglotSindhihttpswwwomniglotcomwritingsindhihtm(Accessedon10thOct2017)
[105]OmniglotKashmirihttpswwwomniglotcomwritingkashmirihtm(Accessedon10thOct2017)
[106]Unicode1000SouthandCentralAsia-I-OfficialScriptsofIndiardquoPage456(R5andR5a)httpwwwunicodeorgversionsUnicode1000ch12pdf(Accessedon13thNov2017)
[107]UnicodeIndicGroupDevanagariEyelashRahttpunicodeorg~emulleriwgp8utcdochtml(Accessedon13thNov2017)
[108]MKRainaHowtoreadandwriteKashmiriinDevanagarihttpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf(Accessedon12thDec2017)
[109]CentralHindiDirectorate-MinistryofHRD-GovtofIndiaDevanāgarīAlphabetanditsRomanizationhttphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml(Accessedon12thDec2017
[110]OmniglotBodohttpswwwomniglotcomwritingbodohtm(Accessedon12thDec2017)
[111]OmniglotMaithilihttpswwwomniglotcomwritingmaithilihtm(Accessedon12thDec2017)
[112]OmniglotKonkanihttpswwwomniglotcomwritingkonkanihtm(Accessedon20thMay2018)
[113]OmniglotNepalihttpswwwomniglotcomwritingnepalihtm(Accessedon20thMay2018)
[114] NBGP Public comment feedback for Devanagari Gujarati Gurmukhi Script LGRProposalshttpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw(Accessedon18thFeb2019)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
53
10 Booksarticlesandwebographiesconsulted
Followingisathematicallysortedsetofdocumentsbooksarticlesandwebographies
consultedinthedraftingofthisreport
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
54
11 KachruYamuna1966AnIntroductiontoHindiSyntaxUrbanaUniversityof
IllinoisDepartmentofLinguistics
12 KalyanKaleandAnjaliSoman1986LearningMarathiShriVishakhaPrakashan
Pune
13 McGregorRS(1977)OutlineofHindiGrammar2ndedDelhiOxfordUniversity
Press
14 McGregorRS1972OutlineofHindiGrammarwithExercisesDelhiOxford
UniversityPress
15 McGregorRS1974HindiLiteratureoftheNineteenthandEarlyTwentieth
CenturiesWiesbadenHarrassowitz
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
55
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
56
11 AppendixAVisuallyconfusablecharacterssequencesTheTable 21 below shows characters character sequenceswhichmay appear visually
confusingtosomeoftheusersoftheDevanagariscriptHowevertheyarenotconsideredconfusingenoughtobecategorizedasvariants
Confusable1 Confusable2
कU+0915
क़U+0915U+093C
खU+0916
ख़ U+0916U+093C
गU+0917
ग़U+0917U+093C
चU+091A
]U+091AU+093C
छU+091B
^U+091BU+093C
जU+091C
ज़U+091CU+093C
डU+0921
ड़U+0921U+093C
ढU+0922
ढ़U+0922U+093C
फU+092B
फ़U+092BU+093C
Table 21 Visually confusables
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
57
12 AppendixBCross-scriptConfusablesThe Devanagari script has a major set of possible cross-script confusables with the
GurmukhiscriptTheTable22liststhem
InadditiontoGurmukhisomeinstancesofcross-scriptconfusablearefoundwithBengali
GujaratiTeluguKannadaMalayalamandSinhala
None of the combinations listed in Table 22 are considered equivalents of each other
whether semantically or otherwise They are only grouped based on possible visualconfusability
Atfirsttheymaynotlookexactlythesamehoweverinthegivencontexteginabrowser
barasapartofadomainnameorasasinglewordwherethereisnosurroundingtextfromthesamescriptfordistinguishingtheycancreatevisualconfusion
A label canbeconsidered tohaveacross-scriptvariant labelonly if all theconstituent
charactersaksharashaveanequivalentconfusableintheotherscriptIfthereisevenonesingle characterakshara which does not have an equivalent visual confusable in other
scriptitessentiallyprovidesavisualdistinctionandhenceanon-confusablestring
Devanagariconfusable Otherscriptconfusable Fromscript
ः
U+0903
ઃU+0A83 Gujarati
ः
U+0903
ః
U+0C03Telugu
ः
U+0903
ಃ
U+0C83Kannada
ः
U+0903
ഃU+0D03
Malayalam
ः
U+0903
ඃU+0A28
Sinhala
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
58
ः
U+0903
ঃ17
U+0983 Bengali
उ
U+0909
ওU+0993
Bengali
घ
U+0918
ঘU+0998
Bengali
ठ
U+0920
ਨ
U+0A28Gurmukhi
ठ
U+0920
ਰ
U+0A30Gurmukhi
ड
U+0921
ਡ
U+0A21Gurmukhi
ड
U+0921
ਤ
U+0A24Gurmukhi
ढU+0922
ਢ
U+0A22Gurmukhi
त
U+0924
ਜ
U+0A1CGurmukhi
य
U+092F
ਧ
U+0A27Gurmukhi
U+0945
U+0981
Bengali
Table 22 Devanagari Cross-script confusables 17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in normative part of the Devanagari and Bengali LGRs It was decided that both look different enough not to be included in the normative part and hence have been added in the Appendix as confusables only
ॴ (U+0974) | आं (U+0906 U+0902)
ऻ (U+093B) | ां (U+093E U+0902)
ऎ (U+090E) | ऐ (U+0910)
ॆ (U+0946) | े (U+0947)
ॵ (U+0975) | औ (U+0914)
ॏ (U+094F) | ौ (U+094C)
Table 17: Proposed Variants - Set 2
No variant contexts are required for these mappings, but the code point sequences introduced as mappings will need to be listed in the repertoire with a matching code point context, so that both the source and the target of the variant mapping have matching contexts (not preceded by H).
6.3 Halant in Final Position (Only a discussion; not proposed as variants)
Another case of deceptive similarity for a majority of the Devanagari user base is that of a word ending in Halant (U+094D) vis-à-vis the same word without the final Halant. As the Halant functions as a vowel killer, many users tend to ignore the phonetic effect of its presence or absence at the end of a word. The majority of users would pronounce both words in the same way, thereby creating a perception of (false) equivalence. However, there also exist some users who clearly require the final Halant to achieve the peculiar phonetic effect of a truncated implicit vowel sound at the end. These users make a clear distinction between the two words (with and without the final Halant). It is for this reason that the final Halant is accommodated in the Whole Label Evaluation rules for Devanagari.
In these cases the presence or absence of the final Halant is clearly visible, and there is no apparent case for making them variant pairs. In the light of practical experience, a future NBGP revision may assess whether these cases need to be considered as variant pairs.
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara
This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. ॅ (U+0945) and ॉ (U+0949), which, when followed by an Anusvara (U+0902), create a shape that resembles a Candrabindu (U+0901). This gives rise to pairs which are rendered exactly alike in many fonts. Though some fonts render them differently, the behavior is not consistent and leaves room for ambiguity. The actual variant pairs are as follows:
Variant 1 | Variant 2
ँ (U+0901) | ॅं (U+0945 U+0902)
ाँ (U+093E U+0901) | ॉं (U+0949 U+0902)
अँ (U+0905 U+0901) | ॲं (U+0972 U+0902)
एँ (U+090F U+0901) | ऍं (U+090D U+0902)
आँ (U+0906 U+0901) | ऑं (U+0911 U+0902)
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not get properly joined in the absence of an independent character, such as a Consonant or a Vowel, before them. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क, U+0915):
Variant Pair 1: कँ (क + U+0901) and कॅं (क + U+0945 U+0902)
Variant Pair 2: काँ (क + U+093E U+0901) and कॉं (क + U+0949 U+0902)
Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should each be rendered with the Anusvara clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity, and hence the variant possibility.
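The pairing in Table 18 can be illustrated with a minimal Python sketch. It assumes that two labels are variants of each other when they become identical after every Variant 2 sequence is rewritten as its Variant 1 counterpart. The names `TABLE_18`, `variant_key` and `are_variants` are hypothetical, the choice of Variant 1 as the canonical side is arbitrary, and the context rules of the LGR are deliberately ignored here.

```python
# Table 18 pairs, stored as (Variant 2 sequence, Variant 1 sequence).
# Using Variant 1 as the canonical side is an assumption of this sketch.
TABLE_18 = [
    ("\u0945\u0902", "\u0901"),        # candra E + anusvara  -> candrabindu
    ("\u0949\u0902", "\u093E\u0901"),  # candra O + anusvara  -> AA sign + candrabindu
    ("\u0972\u0902", "\u0905\u0901"),  # candra A + anusvara  -> A + candrabindu
    ("\u090D\u0902", "\u090F\u0901"),  # candra E letter + anusvara -> E + candrabindu
    ("\u0911\u0902", "\u0906\u0901"),  # candra O letter + anusvara -> AA + candrabindu
]


def variant_key(label: str) -> str:
    """Rewrite every Variant 2 sequence as its Variant 1 counterpart."""
    for variant2, variant1 in TABLE_18:
        label = label.replace(variant2, variant1)
    return label


def are_variants(a: str, b: str) -> bool:
    """Two distinct labels are treated as variants if their keys coincide."""
    return a != b and variant_key(a) == variant_key(b)
```

With these definitions, the two forms of Variant Pair 1 (Ka followed by U+0901, and Ka followed by U+0945 U+0902) share one key, while Ka followed by a plain Anusvara does not.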
6.4.1 Variant context rule for the Candrabindu and Candra-Anusvara variant pair
The variant pair ँ (U+0901) - ॅं (U+0945 U+0902) requires that, to create a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence ॅं (U+0945 U+0902). Note that this sequence begins with a Matra sign, i.e. ॅ (U+0945). While the Candrabindu (U+0901) can be preceded by a Consonant, a Vowel, a Nukta or a Matra, the same is not true for ॅ (U+0945), which is a Matra: a Matra can only be preceded by a Consonant or by a Consonant followed by a Nukta. To handle this, a variant context rule has been added to ॅं (U+0945 U+0902), as given below.
Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair ँ (U+0901) - ॅं (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or by a Consonant followed by a Nukta.
There is no additional constraint on the character preceding ॅं (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede that sequence. However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
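The context condition above can be sketched in Python as follows. This is only an illustration: the consonant class is approximated by the range U+0915..U+0939, which is an assumption and not the Table 6 repertoire, and the function names are hypothetical.

```python
CANDRABINDU = "\u0901"
CANDRA_E_ANUSVARA = "\u0945\u0902"
NUKTA = "\u093C"
# Assumption: U+0915..U+0939 stands in for the consonant class (C).
CONSONANTS = {chr(cp) for cp in range(0x0915, 0x093A)}


def context_allows_variant(label: str, index: int) -> bool:
    """True if label[index] is preceded by C, or by C followed by a Nukta."""
    if index >= 1 and label[index - 1] in CONSONANTS:
        return True
    return (
        index >= 2
        and label[index - 1] == NUKTA
        and label[index - 2] in CONSONANTS
    )


def candra_anusvara_variant(label: str) -> str:
    """Replace each Candrabindu by U+0945 U+0902 where the context permits."""
    pieces = []
    for index, ch in enumerate(label):
        if ch == CANDRABINDU and context_allows_variant(label, index):
            pieces.append(CANDRA_E_ANUSVARA)
        else:
            pieces.append(ch)
    return "".join(pieces)
```

A Candrabindu that follows a Vowel, as in U+0905 U+0901, is left unchanged by this mapping, because a Matra could not stand in that position.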
6.5 Variant Disposition
As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
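The "first chosen, others blocked" behaviour can be sketched as a small Python registry. This is a hypothetical illustration, not part of the LGR or of any LGR tool: the class `Registry`, its `apply` method and the one-pair `toy_key` function are all assumptions introduced for the example.

```python
from typing import Callable, Dict


class Registry:
    """Allocates the first label of a variant set and blocks the rest."""

    def __init__(self, variant_key: Callable[[str], str]) -> None:
        self._variant_key = variant_key
        self._allocated: Dict[str, str] = {}  # variant key -> allocated label

    def apply(self, label: str) -> str:
        key = self._variant_key(label)
        holder = self._allocated.get(key)
        if holder is None:
            self._allocated[key] = label
            return "allocated"
        return "already allocated" if holder == label else "blocked"


def toy_key(label: str) -> str:
    """Toy key covering a single Table 18 pair: U+0945 U+0902 -> U+0901."""
    return label.replace("\u0945\u0902", "\u0901")
```

Because neither variant is preferred, the outcome depends only on the order of application: whichever label arrives first is allocated, and its variant is blocked afterwards.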
6.6 Cross-script Variants
A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.
Every individual LGR under NBGP is supposed to provide a set of cross-script variants it identifies with all other scripts under NBGP.
NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)
The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.
Note that none of the combinations listed in Table 19 and Table 20 are deemed equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.
NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं (U+0902) | ਂ (U+0A02)
इ (U+0907) | ਙ (U+0A19)
उ (U+0909) | ਤ (U+0A24)
ग (U+0917) | ਗ (U+0A17)
घ (U+0918) | ਬ (U+0A2C)
ट (U+091F) | ਟ (U+0A1F)
ठ (U+0920) | ਠ (U+0A20)
ढ (U+0922) | ਫ (U+0A2B)
प (U+092A) | ਧ (U+0A27)
भ (U+092D) | ਮ (U+0A2E)
म (U+092E) | ਸ (U+0A38)
व (U+0935) | ਕ (U+0A15)
ह (U+0939) | ਵ (U+0A35)
ऺ (U+093A) | ਂ (U+0A02)
़ (U+093C) | ਼ (U+0A3C)
ि (U+093F) | ਿ (U+0A3F)
ी (U+0940) | ੀ (U+0A40)
ॅ (U+0945) | ੱ (U+0A71)
ॆ (U+0946) | ੇ (U+0A47)
ॆ (U+0946) | ੋ (U+0A4B)
े (U+0947) | ੇ (U+0A47)
े (U+0947) | ੋ (U+0A4B)
ै (U+0948) | ੈ (U+0A48)
ॖ (U+0956) | ੁ (U+0A41)
ॗ (U+0957) | ੂ (U+0A42)
प्टि (U+092A U+094D U+091F U+093F) | ਇ (U+0A07)
प्टी (U+092A U+094D U+091F U+0940) | ਈ (U+0A08)
प्टे (U+092A U+094D U+091F U+0947) | ਏ (U+0A0F)
प्टॆ (U+092A U+094D U+091F U+0946) | ਏ (U+0A0F)
त्त (U+0924 U+094D U+0924) | ਜ (U+0A1C)
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म (U+092E) | ম (U+09AE)
ि (U+093F) | ি (U+09BF)
Table 20: Proposed Cross-script Devanagari-Bengali Variants
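Since Table 19 maps some multi-code-point Devanagari sequences to a single Gurmukhi code point, a lookup has to try longer sequences before shorter ones. The Python sketch below shows one possible way to do this with greedy longest-match over a small excerpt of Table 19. The names `TABLE_19_EXCERPT` and `gurmukhi_variant` are hypothetical, only a few rows are included, and one-to-many rows (such as U+0946) are left out for simplicity.

```python
from typing import Optional

# A small excerpt of Table 19 (Devanagari sequence -> Gurmukhi code point).
TABLE_19_EXCERPT = {
    "\u0917": "\u0A17",                    # GA
    "\u091F": "\u0A1F",                    # TTA
    "\u0920": "\u0A20",                    # TTHA
    "\u092A": "\u0A27",                    # PA -> Gurmukhi DHA
    "\u092E": "\u0A38",                    # MA -> Gurmukhi SA
    "\u092A\u094D\u091F\u093F": "\u0A07",  # PA + Halant + TTA + I sign
    "\u0924\u094D\u0924": "\u0A1C",        # TA + Halant + TA
}
MAX_KEY_LENGTH = max(len(key) for key in TABLE_19_EXCERPT)


def gurmukhi_variant(label: str) -> Optional[str]:
    """Map a label to Gurmukhi, or return None if any part has no mapping."""
    out = []
    i = 0
    while i < len(label):
        # Try the longest candidate sequence first, then shorter ones.
        for size in range(min(MAX_KEY_LENGTH, len(label) - i), 0, -1):
            target = TABLE_19_EXCERPT.get(label[i:i + size])
            if target is not None:
                out.append(target)
                i += size
                break
        else:
            return None  # this position has no cross-script counterpart
    return "".join(out)
```

A label yields a cross-script variant only when every part of it is covered; a single uncovered code point makes the function return `None`.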
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is U+0931 (ऱ, DEVANAGARI LETTER RRA), H is U+094D (्, DEVANAGARI SIGN VIRAMA), and C3 is either U+092F (य, DEVANAGARI LETTER YA) or U+0939 (ह, DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN, where CN is a C followed by an N.
3. M must be preceded by C or CN, where CN is a C followed by an N.
4. X must be preceded by either of V, C, N or M.
5. B must be preceded by either of V, C, N or M.
6. D must be preceded by either of V, C, N or M.
7. V cannot be preceded by H (details in "Case of V preceded by H").
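The seven rules above lend themselves to a straightforward left-to-right check. The following Python sketch is one hedged way to express them; it is not the normative XML of the LGR. The category sets `CONSONANTS`, `VOWELS` and `MATRAS` are simplified assumptions based on Unicode ranges rather than the Table 6 repertoire, the Eyelash Reph sequences are not treated specially, and the function names are hypothetical.

```python
from typing import List, Optional

NUKTA, HALANT = "\u093C", "\u094D"
# Assumed, simplified category sets (not the Table 6 repertoire).
CONSONANTS = {chr(cp) for cp in range(0x0915, 0x093A)}
VOWELS = {chr(cp) for cp in range(0x0905, 0x0915)} | {"\u0972"}
MATRAS = {chr(cp) for cp in range(0x093E, 0x094D)}
SIGNS = {"\u0902": "B", "\u0901": "D", "\u0903": "X", HALANT: "H", NUKTA: "N"}

# Sets named in rule 1.
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}

# Rules 4, 5 and 6 share one condition; map category -> rule number.
FOLLOWER_RULES = {"X": 4, "B": 5, "D": 6}


def category(ch: str) -> Optional[str]:
    """Return the WLE symbol for a code point, or None if unclassified."""
    if ch in CONSONANTS:
        return "C"
    if ch in VOWELS:
        return "V"
    if ch in MATRAS:
        return "M"
    return SIGNS.get(ch)


def wle_violations(label: str) -> List[int]:
    """Return the numbers of the WLE rules (1 to 7) that the label breaks."""
    cats = [category(ch) for ch in label]
    broken = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        prev_ch = label[i - 1] if i > 0 else ""
        # "C or CN": a consonant, or a nukta that itself follows a consonant.
        after_c_or_cn = prev == "C" or (
            prev == "N" and i >= 2 and cats[i - 2] == "C"
        )
        if cat == "N" and prev_ch not in C1 | V1 | M1:
            broken.append(1)
        elif cat == "H" and not after_c_or_cn:
            broken.append(2)
        elif cat == "M" and not after_c_or_cn:
            broken.append(3)
        elif cat in FOLLOWER_RULES and prev not in {"V", "C", "N", "M"}:
            broken.append(FOLLOWER_RULES[cat])
        elif cat == "V" and prev == "H":
            broken.append(7)
    return broken
```

An empty result means that none of the seven rules is broken under these assumptions. The label U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930, in which a Vowel follows a Halant, is reported as breaking rule 7.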
Additional rules are used only for variants where a Nukta maps to null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant is undefined if it is not followed by V or C (including RRA) or the end of the label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H
As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether both of these can follow any valid akshar-ending character. Note that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | Basanta Kumar Panda | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | Debajit Sharma | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India | -
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India | -
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkz.com | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant (httpsमराभारत) | India | Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters and character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क (U+0915) | क़ (U+0915 U+093C)
ख (U+0916) | ख़ (U+0916 U+093C)
ग (U+0917) | ग़ (U+0917 U+093C)
च (U+091A) | च़ (U+091A U+093C)
छ (U+091B) | छ़ (U+091B U+093C)
ज (U+091C) | ज़ (U+091C U+093C)
ड (U+0921) | ड़ (U+0921 U+093C)
ढ (U+0922) | ढ़ (U+0922 U+093C)
फ (U+092B) | फ़ (U+092B U+093C)
Table 21: Visually confusables
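Every pair in Table 21 differs only by a Nukta (U+093C) on one of nine base consonants, so a tool that wants to flag such pairs for review could compare labels after dropping those Nuktas. The Python sketch below illustrates the idea; it is an assumption of this example, not an LGR requirement (these pairs are explicitly not variants). It also assumes that labels use the base + U+093C sequences shown in the table, and the function names are hypothetical.

```python
NUKTA = "\u093C"
# Base consonants listed in Table 21.
TABLE_21_BASES = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")


def nukta_skeleton(label: str) -> str:
    """Drop each Nukta that directly follows a Table 21 base consonant."""
    out = []
    for ch in label:
        if ch == NUKTA and out and out[-1] in TABLE_21_BASES:
            continue
        out.append(ch)
    return "".join(out)


def visually_confusable(a: str, b: str) -> bool:
    """Flag two distinct labels whose skeletons coincide."""
    return a != b and nukta_skeleton(a) == nukta_skeleton(b)
```

A Nukta after a character outside Table 21 is kept in the skeleton, so such labels are compared as they are.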
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in a given context, e.g. in a browser bar, as a part of a domain name, or as a single word with no surrounding text from the same script to distinguish it, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
Devanagariconfusable Otherscriptconfusable Fromscript
ः
U+0903
ઃU+0A83 Gujarati
ः
U+0903
ః
U+0C03Telugu
ः
U+0903
ಃ
U+0C83Kannada
ः
U+0903
ഃU+0D03
Malayalam
ः
U+0903
ඃU+0A28
Sinhala
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
58
ः
U+0903
ঃ17
U+0983 Bengali
उ
U+0909
ওU+0993
Bengali
घ
U+0918
ঘU+0998
Bengali
ठ
U+0920
ਨ
U+0A28Gurmukhi
ठ
U+0920
ਰ
U+0A30Gurmukhi
ड
U+0921
ਡ
U+0A21Gurmukhi
ड
U+0921
ਤ
U+0A24Gurmukhi
ढU+0922
ਢ
U+0A22Gurmukhi
त
U+0924
ਜ
U+0A1CGurmukhi
य
U+092F
ਧ
U+0A27Gurmukhi
U+0945
U+0981
Bengali
Table 22 Devanagari Cross-script confusables 17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in normative part of the Devanagari and Bengali LGRs It was decided that both look different enough not to be included in the normative part and hence have been added in the Appendix as confusables only
6.4 Variants based on Candrabindu and Candra Vowel Signs followed by Anusvara

This is a case of pairs of similar-looking variants involving a Candrabindu (U+0901). In Devanagari there are two Candra vowel signs, viz. U+0945 and ॉ (U+0949), which, when succeeded by an Anusvara (U+0902), create a shape that resembles a Candrabindu (U+0901). This gives rise to pairs which are rendered exactly alike in many fonts. Though some fonts render them differently, the behavior is not consistent and leaves room for ambiguity. The actual pairs of variants are as follows:
Variant 1 | Variant 2
ँ U+0901 | ॅं U+0945 U+0902
ाँ U+093E U+0901 | ॉं U+0949 U+0902
अँ U+0905 U+0901 | ॲं U+0972 U+0902
एँ U+090F U+0901 | ऍं U+090D U+0902
आँ U+0906 U+0901 | ऑं U+0911 U+0902
Table 18: Proposed Variants - Set 3
As the first two suggested pairs are composed only of dependent signs, they do not join properly in the absence of a preceding independent character such as a Consonant or a Vowel. For the sake of clarity, both pairs are shown here preceded by the Devanagari Letter Ka (क, U+0915):

Variant Pair 1: कँ (U+0901) and कॅं (U+0945 U+0902)
Variant Pair 2: काँ (U+093E U+0901) and कॉं (U+0949 U+0902)
Ideally, the sequences U+0945 U+0902 and U+0949 U+0902 should each be rendered so that the Anusvara is clearly separated from the Candra shape. However, not all fonts follow this convention, and many of them render the two shapes exactly like a Candrabindu. This gives rise to an ambiguity and hence to the variant possibility.
6.4.1 Variant context rule for Candrabindu and Candra-Anusvara variant pair

The variant pair (U+0901) - (U+0945 U+0902) requires that, to create a variant label, every Candrabindu (U+0901) in a label be replaced by the sequence (U+0945 U+0902). Note that this sequence begins with a Matra sign, i.e. U+0945. While the Candrabindu (U+0901) can be preceded by a Consonant, a Vowel, a Nukta or a Matra, the same is not true for U+0945, which is a Matra: a Matra can only be preceded by a Consonant or by a Consonant followed by a Nukta. To handle this, a variant context rule has been added to (U+0945 U+0902), as given below.

Rule: As per Table 18 (Proposed Variants - Set 3), the variant relationship between the pair (U+0901) - (U+0945 U+0902) exists if and only if the Candrabindu (U+0901) is preceded by a Consonant or by a Consonant followed by a Nukta.

There is no additional constraint on the character preceding (U+0945 U+0902), as the Candrabindu can be preceded by any character which can precede the sequence (U+0945 U+0902). However, to express the mapping, the sequence U+0945 U+0902 is formally part of the repertoire and as such must have a context rule that matches the context rule for U+0945, which happens to be the same context rule as for the variant mapping. For the same symmetry reason as discussed in Section 6.1.1 above, the formal definition of the variant mapping (U+0945 U+0902) -> (U+0901) has the same variant context condition as that for the mapping (U+0901) -> (U+0945 U+0902).
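The context rule above can be illustrated with a minimal Python sketch. It is not the normative XML rule: the function names are hypothetical, only the first pair of Table 18 is handled, and the range U+0915..U+0939 is assumed to stand in for the Consonant category of the repertoire.

```python
CANDRABINDU = "\u0901"
CANDRA_E = "\u0945"   # DEVANAGARI VOWEL SIGN CANDRA E (a Matra)
ANUSVARA = "\u0902"
NUKTA = "\u093C"


def is_consonant(ch: str) -> bool:
    # Assumption: U+0915..U+0939 approximates category C of the repertoire.
    return "\u0915" <= ch <= "\u0939"


def context_allows_variant(label: str, i: int) -> bool:
    """True if position i is preceded by a Consonant, or by Consonant + Nukta."""
    if i >= 1 and is_consonant(label[i - 1]):
        return True
    return i >= 2 and label[i - 1] == NUKTA and is_consonant(label[i - 2])


def candra_anusvara_variant(label: str) -> str:
    """Replace each context-eligible U+0901 with the sequence U+0945 U+0902."""
    out = []
    for i, ch in enumerate(label):
        if ch == CANDRABINDU and context_allows_variant(label, i):
            out.append(CANDRA_E + ANUSVARA)
        else:
            out.append(ch)
    return "".join(out)
```

A Candrabindu after a Vowel or a Matra is left untouched by this function, because U+0945 could not legally follow those characters; the other rows of Table 18 cover such cases with their own mappings.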
6.5 Variant Disposition

As the variants mentioned in Table 16, Table 17 and Table 18 are confusingly similar, albeit of a peculiar nature, it is proposed that they be given the blocked disposition.

There is no preference among these variants. Whichever label containing either of these variants is chosen first, the other equivalent variant label should be blocked.
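The "first chosen wins, the other is blocked" behaviour can be sketched as follows. This is a simplified illustration, not how any registry or LGR tool is known to work: the `Registry` class and `variant_key` helper are hypothetical, only the Table 18 pairs are included, and the variant context rules are ignored.

```python
# Table 18 pairs as (Variant 1, Variant 2); Tables 16 and 17 are omitted here.
TABLE_18 = [
    ("\u093E\u0901", "\u0949\u0902"),
    ("\u0905\u0901", "\u0972\u0902"),
    ("\u090F\u0901", "\u090D\u0902"),
    ("\u0906\u0901", "\u0911\u0902"),
    ("\u0901", "\u0945\u0902"),
]


def variant_key(label: str) -> str:
    """Fold every Variant 2 sequence to its Variant 1 form (context rules ignored)."""
    for v1, v2 in TABLE_18:
        label = label.replace(v2, v1)
    return label


class Registry:
    """Hypothetical registry that blocks variants of an already chosen label."""

    def __init__(self) -> None:
        self._taken = {}  # variant key -> label that was chosen first

    def apply(self, label: str) -> str:
        key = variant_key(label)
        if key in self._taken:
            return "already allocated" if self._taken[key] == label else "blocked"
        self._taken[key] = label
        return "allocated"
```

Because both members of a pair fold to the same key, neither is preferred: whichever is applied for first is allocated and the other is blocked.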
6.6 Cross-script Variants

A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.

Every individual LGR under NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under NBGP.

NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)

The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed as cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 lists the cases proposed as cross-script variants between Devanagari and Bengali.

It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.

NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
U+0902 | U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
U+093A | U+0A02
U+093C | U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
U+0945 | U+0A71
U+0946 | U+0A47
U+0946 | U+0A4B
U+0947 | U+0A47
U+0947 | U+0A4B
U+0948 | U+0A48
U+0956 | U+0A41
U+0957 | U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
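To make the whole-label idea concrete, here is a small Python sketch that tries to compose a Gurmukhi look-alike for a Devanagari label. It is an illustration under stated assumptions: `DEVA_TO_GURU` holds only a subset of Table 19, rows with two Gurmukhi targets (U+0946, U+0947) are left out, and greedy longest-match is a simplification chosen for this sketch rather than the behaviour of any LGR tool.

```python
from typing import Optional

# Subset of Table 19: Devanagari code point sequence -> Gurmukhi code point.
DEVA_TO_GURU = {
    "\u092A\u094D\u091F\u093F": "\u0A07",  # PA + VIRAMA + TTA + sign I
    "\u092A\u094D\u091F\u0940": "\u0A08",  # PA + VIRAMA + TTA + sign II
    "\u0924\u094D\u0924": "\u0A1C",        # TA + VIRAMA + TA
    "\u0917": "\u0A17",
    "\u091F": "\u0A1F",
    "\u0920": "\u0A20",
    "\u092A": "\u0A27",
    "\u092E": "\u0A38",
    "\u0935": "\u0A15",
    "\u093F": "\u0A3F",
    "\u0940": "\u0A40",
}

# Try longer sequences first so conjuncts win over their first letter.
_KEYS = sorted(DEVA_TO_GURU, key=len, reverse=True)


def gurmukhi_whole_label_variant(label: str) -> Optional[str]:
    """Return a Gurmukhi look-alike label, or None if any part has no mapping."""
    out, i = [], 0
    while i < len(label):
        for key in _KEYS:
            if label.startswith(key, i):
                out.append(DEVA_TO_GURU[key])
                i += len(key)
                break
        else:
            return None  # one unmapped character rules out a whole-label variant
    return "".join(out)
```

A label yields a cross-script variant only when every part of it is covered by the table; a single uncovered character, such as क (U+0915), means no whole-label variant exists in this sketch.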
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)

This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.

Below are the symbols used in the WLE rules, one for each of the categories mentioned in Table 6 (Code point repertoire):

C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is U+0931 (ऱ, DEVANAGARI LETTER RRA), H is U+094D (DEVANAGARI SIGN VIRAMA), and C3 is either U+092F (य, DEVANAGARI LETTER YA) or U+0939 (ह, DEVANAGARI LETTER HA)
Below are the specific WLE rules:

1. N must be preceded only by a member of C1, V1 or M1.
   The set C1 consists of these consonants:
   a. क (U+0915)
   b. ख (U+0916)
   c. ग (U+0917)
   d. च (U+091A)
   e. छ (U+091B)
   f. ज (U+091C)
   g. ड (U+0921)
   h. ढ (U+0922)
   i. फ (U+092B)
   The set V1 consists of these vowels:
   a. आ (U+0906) (Required in Santali language)
   b. ओ (U+0913) (Required in Santali language)
   The set M1 consists of these matras:
   a. ा (U+093E) (Required in Santali language)
   b. ो (U+094B) (Required in Santali language)
2. H must be preceded by C or CN [15]
3. M must be preceded by C or CN [16]
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V CANNOT be preceded by H (details in "Case of V preceded by H")

[15], [16] where CN is a C followed by an N.
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
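The seven numbered WLE rules can be sketched as a small checker in Python. This is only an approximation for illustration: the code point ranges used for C, V and M are assumptions rather than the actual repertoire of Table 6, the function names are hypothetical, and the permitted sequences of Table 7 and the variant-only additional rules are not modelled.

```python
from typing import Optional

C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
SIGNS = {0x0902: "B", 0x0901: "D", 0x0903: "X", 0x094D: "H", 0x093C: "N"}


def category(ch: str) -> Optional[str]:
    """Map a character to a WLE symbol (ranges are assumptions of this sketch)."""
    cp = ord(ch)
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x0905 <= cp <= 0x0914 or cp == 0x0972:
        return "V"
    if 0x093E <= cp <= 0x094C:
        return "M"
    return SIGNS.get(cp)


def first_wle_violation(label: str) -> Optional[int]:
    """Return the number of the first violated rule (1-7), or None if none is."""
    prev_ch, prev_cat, prev2_cat = None, None, None
    for ch in label:
        cat = category(ch)
        if cat is None:
            raise ValueError(f"U+{ord(ch):04X} is outside this sketch's repertoire")
        # "C or CN": previous is a Consonant, or a Nukta that follows a Consonant.
        after_c_or_cn = prev_cat == "C" or (prev_cat == "N" and prev2_cat == "C")
        if cat == "N" and prev_ch not in (C1 | V1 | M1):
            return 1
        if cat == "H" and not after_c_or_cn:
            return 2
        if cat == "M" and not after_c_or_cn:
            return 3
        if cat in ("X", "B", "D") and prev_cat not in ("V", "C", "N", "M"):
            return {"X": 4, "B": 5, "D": 6}[cat]
        if cat == "V" and prev_cat == "H":
            return 7
        prev_ch, prev_cat, prev2_cat = ch, cat, prev_cat
    return None
```

Each rule only looks at the one or two characters before the current one, which is why a single left-to-right pass with two remembered categories is enough here.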
Case of Eyelash Reph

In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H

As any valid akshar in Devanagari begins with either a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.

There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).

This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.

If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
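The "Mango pickle" example can be inspected code point by code point with Python's standard `unicodedata` module. The helper names below are hypothetical, and the range used for independent vowels is an assumption of this sketch rather than the repertoire of Table 6.

```python
import unicodedata

HALANT = "\u094D"


def is_independent_vowel(ch: str) -> bool:
    # Assumption: U+0905..U+0914 plus U+0972 approximates category V.
    cp = ord(ch)
    return 0x0905 <= cp <= 0x0914 or cp == 0x0972


def vowel_after_halant_positions(label: str) -> list:
    """Indices of independent vowels that directly follow a Halant."""
    return [
        i
        for i in range(1, len(label))
        if label[i - 1] == HALANT and is_independent_vowel(label[i])
    ]


def describe(label: str) -> list:
    """Code point and Unicode character name for each character in the label."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in label]


# The label from the text, and the same label with the Halant dropped.
with_h = "\u0906\u092E\u094D\u0905\u091A\u093E\u0930"
without_h = with_h.replace(HALANT, "", 1)
```

Here `with_h` contains a vowel at index 3 directly after the Halant, which is what the rule prohibits, while `without_h` is the form the text describes as sufficient for most of the linguistic community.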
8 Contributors

NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data

Following is the full list of NBGP members with their language expertise:

Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | Basanta Kumar Panda | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | Debajit Sharma | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DN Domains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualize thy soul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:

Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K P Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References

[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0 (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0 (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0 (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0 (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted

Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.

10.1 WRITING SYSTEMS
1. Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchison, London, 1968.

10.2 DEVANĀGARĪ
1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London Ph.D. dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar, 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.

10.3 INDIC COMPUTING SPECIFIC
1. IS 10401: 8-bit code for information interchange, 1982.
2. IS 10315: 7-bit coded character set for information interchange, 1985.
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987.
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001.
5. ISO 2375: Procedure for registration of escape sequences, 2003.
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001.
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences

Table 21 below shows characters / character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21: Visually confusables
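Since every row of Table 21 differs only by a Nukta (U+093C) on one of nine consonants, a tool could flag such label pairs for human review with a check like the one below. This is a hedged sketch with hypothetical function names; it is informational only, because the pairs in Table 21 are explicitly not variants.

```python
NUKTA = "\u093C"
# The nine base consonants listed in Table 21.
TABLE_21_BASES = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")


def strip_table21_nukta(label: str) -> str:
    """Drop each Nukta that directly follows one of the Table 21 consonants."""
    out = []
    for ch in label:
        if ch == NUKTA and out and out[-1] in TABLE_21_BASES:
            continue
        out.append(ch)
    return "".join(out)


def confusable_by_nukta(a: str, b: str) -> bool:
    """True if two distinct labels differ only by Table 21 Nukta placement."""
    return a != b and strip_table21_nukta(a) == strip_table21_nukta(b)
```

A Nukta after any other character, for example after आ (U+0906), is kept, so such labels are not reported by this check.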
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.

At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as part of a domain name, or as a single word where there is no surrounding text from the same script to distinguish them, they can create visual confusion.

A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even a single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 [17] | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
U+0945 | U+0981 | Bengali
Table 22: Devanagari Cross-script confusables

[17] The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
49
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
50
In addition following members externally gave inputs to NBGP for the respectivelanguagesscripts
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
9 References
[MSR]IntegrationPanelMaximalStartingRepertoiremdashMSR-4OverviewandRationale7February2019httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf(Accessedon18thFeb2019)
[EGIDS]ExpandedGradedIntergenerationalDisruptionScalehttpswwwethnologuecomaboutlanguage-status(Accessedon13thNov2017)
[NBGP]Neo-BrahmiGenerationPanel
[RFC7940]DaviesKandAFreytagRepresentingLabelGenerationRulesetsusingXMLRFC7940August2016httpstoolsietforghtmlrfc7940(Accessedon1stDec2017)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
51
[RCF8228]AFreytagldquoGuidanceonDesigningLabelGenerationRulesets(LGRs)SupportingVariantLabelsAugust2017httpstoolsietforghtmlrfc8228(Accessedon1stDec2017)
[gTLD]genericTopLevelDomain
[ISCII]IndianScriptCodeforInformationInterchangehttpscdacinindexaspxid=mlc_gist_iscii(Accessedon2ndFeb2018)
[GIST]GraphicsIntelligencebasedScriptTechnologieshttpscdacinindexaspxid=gist(Accessedon2ndFeb2018)
[C-DAC]CentreforDevelopmentofAdvancedComputinghttpscdacin(Accessedon2ndFeb2018)
[0]TheUnicodeStandard11httpwwwunicodeorgversionsUnicode110(Accessedon12thDec2017)
[8]TheUnicodeStandard50httpwwwunicodeorgversionsUnicode500(Accessedon12thDec2017)
[9]TheUnicodeStandard51httpwwwunicodeorgversionsUnicode510(Accessedon12thDec2017)
[11]TheUnicodeStandard60httpwwwunicodeorgversionsUnicode600(Accessedon12thDec2017)
[100]DevanāgarīVIPTeamldquoVariantIssuesReportrdquoICANN3rdOct2011httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf(Accessedon10thOct2017)
[101]OmniglotHindihttpswwwomniglotcomwritinghindihtm(Accessedon10thOct2017)
[102]OmniglotMarathihttpswwwomniglotcomwritingmarathihtm(Accessedon10thOct2017)
[103]OmniglotSanskrithttpswwwomniglotcomwritingsanskrithtm(Accessedon10thOct2017)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
52
[104]OmniglotSindhihttpswwwomniglotcomwritingsindhihtm(Accessedon10thOct2017)
[105]OmniglotKashmirihttpswwwomniglotcomwritingkashmirihtm(Accessedon10thOct2017)
[106]Unicode1000SouthandCentralAsia-I-OfficialScriptsofIndiardquoPage456(R5andR5a)httpwwwunicodeorgversionsUnicode1000ch12pdf(Accessedon13thNov2017)
[107]UnicodeIndicGroupDevanagariEyelashRahttpunicodeorg~emulleriwgp8utcdochtml(Accessedon13thNov2017)
[108]MKRainaHowtoreadandwriteKashmiriinDevanagarihttpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf(Accessedon12thDec2017)
[109]CentralHindiDirectorate-MinistryofHRD-GovtofIndiaDevanāgarīAlphabetanditsRomanizationhttphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml(Accessedon12thDec2017
[110]OmniglotBodohttpswwwomniglotcomwritingbodohtm(Accessedon12thDec2017)
[111]OmniglotMaithilihttpswwwomniglotcomwritingmaithilihtm(Accessedon12thDec2017)
[112]OmniglotKonkanihttpswwwomniglotcomwritingkonkanihtm(Accessedon20thMay2018)
[113]OmniglotNepalihttpswwwomniglotcomwritingnepalihtm(Accessedon20thMay2018)
[114] NBGP Public comment feedback for Devanagari Gujarati Gurmukhi Script LGRProposalshttpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw(Accessedon18thFeb2019)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
53
10 Booksarticlesandwebographiesconsulted
Followingisathematicallysortedsetofdocumentsbooksarticlesandwebographies
consultedinthedraftingofthisreport
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
54
11 KachruYamuna1966AnIntroductiontoHindiSyntaxUrbanaUniversityof
IllinoisDepartmentofLinguistics
12 KalyanKaleandAnjaliSoman1986LearningMarathiShriVishakhaPrakashan
Pune
13 McGregorRS(1977)OutlineofHindiGrammar2ndedDelhiOxfordUniversity
Press
14 McGregorRS1972OutlineofHindiGrammarwithExercisesDelhiOxford
UniversityPress
15 McGregorRS1974HindiLiteratureoftheNineteenthandEarlyTwentieth
CenturiesWiesbadenHarrassowitz
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
55
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
56
11 AppendixAVisuallyconfusablecharacterssequencesTheTable 21 below shows characters character sequenceswhichmay appear visually
confusingtosomeoftheusersoftheDevanagariscriptHowevertheyarenotconsideredconfusingenoughtobecategorizedasvariants
Confusable1 Confusable2
कU+0915
क़U+0915U+093C
खU+0916
ख़ U+0916U+093C
गU+0917
ग़U+0917U+093C
चU+091A
]U+091AU+093C
छU+091B
^U+091BU+093C
जU+091C
ज़U+091CU+093C
डU+0921
ड़U+0921U+093C
ढU+0922
ढ़U+0922U+093C
फU+092B
फ़U+092BU+093C
Table 21 Visually confusables
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
57
12 AppendixBCross-scriptConfusablesThe Devanagari script has a major set of possible cross-script confusables with the
GurmukhiscriptTheTable22liststhem
InadditiontoGurmukhisomeinstancesofcross-scriptconfusablearefoundwithBengali
GujaratiTeluguKannadaMalayalamandSinhala
None of the combinations listed in Table 22 are considered equivalents of each other
whether semantically or otherwise They are only grouped based on possible visualconfusability
Atfirsttheymaynotlookexactlythesamehoweverinthegivencontexteginabrowser
barasapartofadomainnameorasasinglewordwherethereisnosurroundingtextfromthesamescriptfordistinguishingtheycancreatevisualconfusion
A label canbeconsidered tohaveacross-scriptvariant labelonly if all theconstituent
charactersaksharashaveanequivalentconfusableintheotherscriptIfthereisevenonesingle characterakshara which does not have an equivalent visual confusable in other
scriptitessentiallyprovidesavisualdistinctionandhenceanon-confusablestring
Devanagariconfusable Otherscriptconfusable Fromscript
ः
U+0903
ઃU+0A83 Gujarati
ः
U+0903
ః
U+0C03Telugu
ः
U+0903
ಃ
U+0C83Kannada
ः
U+0903
ഃU+0D03
Malayalam
ः
U+0903
ඃU+0A28
Sinhala
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
58
ः
U+0903
ঃ17
U+0983 Bengali
उ
U+0909
ওU+0993
Bengali
घ
U+0918
ঘU+0998
Bengali
ठ
U+0920
ਨ
U+0A28Gurmukhi
ठ
U+0920
ਰ
U+0A30Gurmukhi
ड
U+0921
ਡ
U+0A21Gurmukhi
ड
U+0921
ਤ
U+0A24Gurmukhi
ढU+0922
ਢ
U+0A22Gurmukhi
त
U+0924
ਜ
U+0A1CGurmukhi
य
U+092F
ਧ
U+0A27Gurmukhi
U+0945
U+0981
Bengali
Table 22 Devanagari Cross-script confusables 17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in normative part of the Devanagari and Bengali LGRs It was decided that both look different enough not to be included in the normative part and hence have been added in the Appendix as confusables only
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
40
There is no preference among these variants. Whichever label containing either of these variants is chosen earlier, the other equivalent variant label should be blocked.
6.6 Cross-script Variants

A cross-script variant, also sometimes referred to as a Whole Label variant, is the variant case where one label in one script can be composed in such a way that it resembles another entire label in a different script.

Every individual LGR under NBGP is supposed to provide the set of cross-script variants it identifies with all other scripts under NBGP.

NBGP has ensured that not only the individual characters but also most of the akshar variations are taken into consideration during the cross-script variant analysis of Devanagari with all the other scripts under NBGP. This was achieved by sharing a list of most of the Devanagari akshar combinations with all the other script teams. (The word 'most' is used here as it is not practical to cover all the possible "Consonant + Halant + Consonant + ..." cases. However, for Devanagari, all cases of "Consonant + Halant + Consonant" combinations were included in the analysis.)

The Devanagari script has a major set of possible cross-script variants only with the Gurmukhi script. The cases listed in Table 19 are proposed to be cross-script variants between Devanagari and Gurmukhi. Similarly, Table 20 has the cases proposed to be cross-script variants between Devanagari and Bengali.

It is to be noted that none of the combinations listed in Table 19 and Table 20 are termed equivalents of each other, semantically or otherwise. They are only grouped based on possible visual confusability.

NBGP has ensured that the Devanagari, Bengali and Gurmukhi LGR teams propose the same set of cross-script variants, by meeting face-to-face on many occasions as well as through mail communications. The same set of cross-script variants (with Devanagari) is supposed to be found in the Bengali and Gurmukhi LGR documents.
Devanagari | Gurmukhi
ं U+0902 | ਂ U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
ऺ U+093A | ਂ U+0A02
़ U+093C | ਼ U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
ॅ U+0945 | ੱ U+0A71
ॆ U+0946 | ੇ U+0A47
ॆ U+0946 | ੋ U+0A4B
े U+0947 | ੇ U+0A47
े U+0947 | ੋ U+0A4B
ै U+0948 | ੈ U+0A48
ॖ U+0956 | ੁ U+0A41
ॗ U+0957 | ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C

Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants

Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF

Table 20: Proposed Cross-script Devanagari-Bengali Variants
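As an illustration only, the whole-label nature of these variants can be sketched in Python. The mapping below is a small subset of Table 19, and the function name gurmukhi_variant and the greedy longest-match strategy are assumptions made for this sketch; an actual LGR tool works from the machine-readable XML and may enumerate variants differently.

```python
# Subset of Table 19 (Devanagari -> Gurmukhi). Keys may be multi-code-point
# sequences, so lookup tries the longest candidate first (an assumption).
VARIANTS = {
    "\u092a\u094d\u091f\u093f": "\u0a07",  # PA + VIRAMA + TTA + I  -> GURMUKHI I
    "\u092a\u094d\u091f\u0940": "\u0a08",  # PA + VIRAMA + TTA + II -> GURMUKHI II
    "\u0924\u094d\u0924": "\u0a1c",        # TA + VIRAMA + TA       -> GURMUKHI JA
    "\u0917": "\u0a17",  # GA  -> GURMUKHI GA
    "\u091f": "\u0a1f",  # TTA -> GURMUKHI TTA
    "\u0920": "\u0a20",  # TTHA -> GURMUKHI TTHA
    "\u092a": "\u0a27",  # PA  -> GURMUKHI DHA
    "\u092e": "\u0a38",  # MA  -> GURMUKHI SA
    "\u0935": "\u0a15",  # VA  -> GURMUKHI KA
    "\u0939": "\u0a35",  # HA  -> GURMUKHI VA
}
MAX_LEN = max(len(key) for key in VARIANTS)


def gurmukhi_variant(label):
    """Return the Gurmukhi variant label, or None if any part has no variant."""
    out, i = [], 0
    while i < len(label):
        for n in range(min(MAX_LEN, len(label) - i), 0, -1):
            piece = label[i:i + n]
            if piece in VARIANTS:
                out.append(VARIANTS[piece])
                i += n
                break
        else:
            # No table entry starts here, so the label as a whole has no variant.
            return None
    return "".join(out) if out else None
```

A label yields a cross-script variant only when every part of it is covered by the table; a single uncovered code point makes the function return None.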
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)

This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.

Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):

C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is U+0931 (ऱ - DEVANAGARI LETTER RRA), H is U+094D (् - DEVANAGARI SIGN VIRAMA) and C3 is either U+092F (य - DEVANAGARI LETTER YA) or U+0939 (ह - DEVANAGARI LETTER HA)
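The definition of S can be written down directly as a pattern. The following Python sketch is illustrative only; the names EYELASH_REPH and eyelash_reph_spans are hypothetical and not part of the LGR.

```python
import re

# S = C2 H C3: RRA (U+0931), VIRAMA (U+094D), then YA (U+092F) or HA (U+0939).
EYELASH_REPH = re.compile("\u0931\u094d[\u092f\u0939]")


def eyelash_reph_spans(label):
    """Return (start, end) index pairs of every Eyelash Reph sequence."""
    return [m.span() for m in EYELASH_REPH.finditer(label)]
```

A sequence built on the ordinary RA (U+0930) does not match, since only RRA (U+0931) forms part of S.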
Below are the specific WLE rules:

1. N must be preceded only by a member of C1, V1 or M1.

The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)

The set V1 consists of these vowels:
a. आ (U+0906) (Required in Santali language)
b. ओ (U+0913) (Required in Santali language)

The set M1 consists of these matras:
a. ा (U+093E) (Required in Santali language)
b. ो (U+094B) (Required in Santali language)

2. H must be preceded by C or CN (where CN is a C followed by an N)
3. M must be preceded by C or CN (where CN is a C followed by an N)
4. X must be preceded by either of V, C, N or M
5. B must be preceded by either of V, C, N or M
6. D must be preceded by either of V, C, N or M
7. V can NOT be preceded by H (details in "Case of V preceded by H")
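The seven rules above can be sketched as a simple left-context check. This is a minimal illustration, not the normative implementation: the sets C1, V1 and M1 are taken from rule 1, but the code point ranges used for C, V and M are assumptions based on the Unicode block layout (the normative categories are those of Table 6), and the names category and wle_violations are hypothetical.

```python
# Sketch of WLE rules 1-7. Ranges for C, V and M are ASSUMPTIONS; the
# normative category of each code point is given in Table 6.
NUKTA, HALANT = 0x093C, 0x094D
SIGNS = {0x0902: "B", 0x0901: "D", 0x0903: "X", NUKTA: "N", HALANT: "H"}

C1 = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C, 0x0921, 0x0922, 0x092B}
V1 = {0x0906, 0x0913}
M1 = {0x093E, 0x094B}


def category(cp):
    """Map a code point to its WLE symbol, or '?' if it is not classified."""
    if cp in SIGNS:
        return SIGNS[cp]
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x0905 <= cp <= 0x0914:
        return "V"
    if 0x093E <= cp <= 0x094C:
        return "M"
    return "?"


def wle_violations(label):
    """Return (index, rule number) for every code point that breaks a rule."""
    cps = [ord(ch) for ch in label]
    cats = [category(cp) for cp in cps]
    found = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        prev_cp = cps[i - 1] if i > 0 else None
        # CN: the previous code point is a Nukta that itself follows a C.
        after_cn = prev == "N" and i >= 2 and cats[i - 2] == "C"
        if cat == "N":
            if prev_cp not in (C1 | V1 | M1):
                found.append((i, 1))
        elif cat in ("H", "M"):
            if not (prev == "C" or after_cn):
                found.append((i, 2 if cat == "H" else 3))
        elif cat in ("X", "B", "D"):
            if prev not in ("V", "C", "N", "M"):
                found.append((i, {"X": 4, "B": 5, "D": 6}[cat]))
        elif cat == "V" and prev == "H":
            found.append((i, 7))
    return found
```

An empty result means no rule was triggered. As noted in the footnote to the test-label file, a given LGR tool may flag an invalid label under a different rule than this sketch does; only the valid/invalid outcome should be compared.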
Additional rules are used only for variants where a Nukta maps to a null, or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph

In the WLE rules, there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it gets permitted only with the specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
Case of V preceded by H

As any valid akshar in Devanagari begins either with a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any character that may validly end an akshar. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.

There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).

This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise, V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.

If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
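One way an applicant-side tool might cope with this prohibition is sketched below. The helper join_words is hypothetical and not part of the LGR; it simply assumes, following the discussion above, that the form of the first word without its final H is acceptable when the next word begins with a vowel.

```python
HALANT = "\u094d"  # DEVANAGARI SIGN VIRAMA


def is_independent_vowel(ch):
    """True for U+0905..U+0914 (assumed range for the V category)."""
    return "\u0905" <= ch <= "\u0914"


def join_words(first, second):
    """Join two words into one label, avoiding the prohibited 'H then V'."""
    if first.endswith(HALANT) and second and is_independent_vowel(second[0]):
        # Drop the trailing Halant so that V is not preceded by H.
        first = first[:-1]
    return first + second
```

For the example above, joining U+0906 U+092E U+094D with U+0905 U+091A U+093E U+0930 yields the seven-character sequence without U+094D; a Halant before a consonant is left untouched.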
8 Contributors

NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data

Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D. Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S. Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A. Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | Basanta Kumar Panda | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | Debajit Sharma | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K. T. | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K. C. Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N. Deiva Sundaram | NDS Lingsoft Solutions Pvt. Ltd. | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P. K. | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S. Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R. | C-DAC | India | Tamil
Member | Shantaram S. Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U. B. Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G. | CALTS, Univ. of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, https://मराभारत | India | Hindi
In addition, the following members externally gave inputs to NBGP for the respective languages/scripts:

Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K. P. Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References

[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 11.0, http://www.unicode.org/versions/Unicode11.0.0/ (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, http://hindinideshalaya.nic.in/english/hindi_orgin/devnagarithesysmbols.html (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted

Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.

10.1 WRITING SYSTEMS

1. Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchison, London, 1968.

10.2 DEVANĀGARĪ

1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha. (1962 edition)
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London Ph.D. dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar. 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.

10.3 INDIC COMPUTING SPECIFIC

1. IS 10401: 8-bit code for information interchange, 1982.
2. IS 10315: 7-bit coded character set for information interchange, 1985.
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987.
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001.
5. ISO 2375: Procedure for registration of escape sequences, 2003.
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001.
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences

Table 21 below shows characters/character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C

Table 21: Visually confusables
12 Appendix B: Cross-script Confusables

The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are grouped only on the basis of possible visual confusability.

At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as part of a domain name, or as a single word with no surrounding text from the same script to distinguish it, they can create visual confusion.

A label can be considered to have a cross-script variant label only if all of its constituent characters/aksharas have an equivalent confusable in the other script. If even a single character/akshara does not have an equivalent visual confusable in the other script, it provides a visual distinction, and the string is therefore non-confusable.
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 (footnote 17) | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali

Table 22: Devanagari Cross-script confusables

Footnote 17: The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
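The all-or-nothing condition stated before Table 22 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the name `GURMUKHI_CONFUSABLES` and the function `fully_confusable` are hypothetical, the dictionary holds only the Gurmukhi rows of Table 22, and the check works per code point rather than per akshara.

```python
# Hypothetical lookup built from the Gurmukhi rows of Table 22.
# A Devanagari code point may have more than one confusable.
GURMUKHI_CONFUSABLES = {
    "\u0920": {"\u0A28", "\u0A30"},  # THA -> Gurmukhi NA, RA
    "\u0921": {"\u0A21", "\u0A24"},  # DDA -> Gurmukhi DDA, TA
    "\u0922": {"\u0A22"},            # DDHA -> Gurmukhi DDHA
    "\u0924": {"\u0A1C"},            # TA -> Gurmukhi JA
    "\u092F": {"\u0A27"},            # YA -> Gurmukhi DHA
}


def fully_confusable(label, table=GURMUKHI_CONFUSABLES):
    """Return True only if every code point of the label has a confusable.

    A single code point without a counterpart is enough to make the
    whole label visually distinct, so the result is then False.
    """
    return len(label) > 0 and all(ch in table for ch in label)
```

For example, a label made only of ठ and ड would be reported as fully confusable, while adding क (which has no row in this sketch's table) makes the result False.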
Devanagari | Gurmukhi
ं U+0902 | ਂ U+0A02
इ U+0907 | ਙ U+0A19
उ U+0909 | ਤ U+0A24
ग U+0917 | ਗ U+0A17
घ U+0918 | ਬ U+0A2C
ट U+091F | ਟ U+0A1F
ठ U+0920 | ਠ U+0A20
ढ U+0922 | ਫ U+0A2B
प U+092A | ਧ U+0A27
भ U+092D | ਮ U+0A2E
म U+092E | ਸ U+0A38
व U+0935 | ਕ U+0A15
ह U+0939 | ਵ U+0A35
ऺ U+093A | ਂ U+0A02
़ U+093C | ਼ U+0A3C
ि U+093F | ਿ U+0A3F
ी U+0940 | ੀ U+0A40
ॅ U+0945 | ੱ U+0A71
ॆ U+0946 | ੇ U+0A47
ॆ U+0946 | ੋ U+0A4B
े U+0947 | ੇ U+0A47
े U+0947 | ੋ U+0A4B
ै U+0948 | ੈ U+0A48
ॖ U+0956 | ੁ U+0A41
ॗ U+0957 | ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C

Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF

Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B (Cross-script Confusables) lists them.
7 Whole Label Evaluation Rules (WLE)

This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in the Devanagari script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.

Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6 (Code point repertoire):

C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant/Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is U+0931 (ऱ - DEVANAGARI LETTER RRA), H is U+094D (् - DEVANAGARI SIGN VIRAMA), and C3 is either U+092F (य - DEVANAGARI LETTER YA) or U+0939 (ह - DEVANAGARI LETTER HA)
Below are the specific WLE rules:

1. N must be preceded only by a member of C1, V1 or M1.

   The set C1 consists of these consonants:
   a. क (U+0915)
   b. ख (U+0916)
   c. ग (U+0917)
   d. च (U+091A)
   e. छ (U+091B)
   f. ज (U+091C)
   g. ड (U+0921)
   h. ढ (U+0922)
   i. फ (U+092B)

   The set V1 consists of these vowels:
   a. आ (U+0906) (required in the Santali language)
   b. ओ (U+0913) (required in the Santali language)

   The set M1 consists of these matras:
   a. ा (U+093E) (required in the Santali language)
   b. ो (U+094B) (required in the Santali language)

2. H must be preceded by C or CN (footnote 15).
3. M must be preceded by C or CN (footnote 16).
4. X must be preceded by either of V, C, N or M.
5. B must be preceded by either of V, C, N or M.
6. D must be preceded by either of V, C, N or M.
7. V CANNOT be preceded by H (details in "Case of V preceded by H").

Footnotes 15 and 16: CN is a C followed by an N.
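The seven rules above can be read as checks on the category of the code point immediately preceding each position. The following Python sketch shows one way to express them. It is not the normative LGR XML, and the `category` function is an assumption: its ranges only approximate Table 6, and code points outside those ranges are ignored rather than rejected. The names `category` and `wle_violations` are hypothetical.

```python
# Sets named in rule 1 (taken from the lists above).
C1 = {0x0915, 0x0916, 0x0917, 0x091A, 0x091B, 0x091C, 0x0921, 0x0922, 0x092B}
V1 = {0x0906, 0x0913}
M1 = {0x093E, 0x094B}


def category(cp):
    """Map a code point to a WLE symbol, or None if not covered.

    ASSUMPTION: these ranges only approximate Table 6; the normative
    category of each code point is the one given in that table.
    """
    singles = {0x0901: "D", 0x0902: "B", 0x0903: "X", 0x093C: "N", 0x094D: "H"}
    if cp in singles:
        return singles[cp]
    if 0x0905 <= cp <= 0x0914:
        return "V"
    if 0x0915 <= cp <= 0x0939:
        return "C"
    if 0x093E <= cp <= 0x094C:
        return "M"
    return None


def wle_violations(label):
    """Return (index, rule_number) pairs; an empty list means no rule fired."""
    cps = [ord(ch) for ch in label]
    cats = [category(cp) for cp in cps]
    found = []
    for i, cat in enumerate(cats):
        prev = cats[i - 1] if i > 0 else None
        prev_cp = cps[i - 1] if i > 0 else None
        # CN: a consonant followed by a nukta (footnotes 15 and 16).
        prev_is_cn = prev == "N" and i >= 2 and cats[i - 2] == "C"
        if cat == "N" and prev_cp not in (C1 | V1 | M1):
            found.append((i, 1))
        elif cat == "H" and not (prev == "C" or prev_is_cn):
            found.append((i, 2))
        elif cat == "M" and not (prev == "C" or prev_is_cn):
            found.append((i, 3))
        elif cat in ("X", "B", "D") and prev not in ("V", "C", "N", "M"):
            found.append((i, {"X": 4, "B": 5, "D": 6}[cat]))
        elif cat == "V" and prev == "H":
            found.append((i, 7))
    return found
```

With this sketch, ज़ (U+091C U+093C) passes because ज is in C1, while म followed by a nukta is reported under rule 1, and a label in which a vowel letter directly follows a virama is reported under rule 7.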
Additional rules are used only for variants where a Nukta maps to a null, or that are overlapped:

• A variant is not defined if it is followed by a Nukta (see 6.4.1).
• A variant is not defined unless it is followed by V or C (including RRA) or by the end of the label (see 6.1.2).
Case of Eyelash Reph

In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:

1. As U+0931 is added as a part of the permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of context rules is required.
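The first reason can be illustrated with a short Python sketch: U+0931 is acceptable only when it occurs inside one of the two sequences defined for symbol S (U+0931 U+094D followed by U+092F or U+0939). The names `EYELASH_REPH` and `rra_only_in_sequences` are hypothetical, and the sketch assumes those two sequences are the only ones in Table 7 that contain U+0931.

```python
import re

# RRA + VIRAMA + (YA or HA): the two sequences described for symbol S.
EYELASH_REPH = re.compile("\u0931\u094D[\u092F\u0939]")


def rra_only_in_sequences(label):
    """Return True if every U+0931 in the label is part of a permitted sequence."""
    # Remove each permitted sequence; any U+0931 left over is a stray one.
    leftover = EYELASH_REPH.sub("", label)
    return "\u0931" not in leftover
```

A label containing U+0931 U+094D U+092F passes this check, whereas U+0931 followed directly by a matra does not.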
Case of V preceded by H

As any valid akshar in Devanagari begins with either a Consonant or a Vowel, in the case of multi-word domains it was necessary to check whether each of these can follow any valid akshar-ending character. Only the case "V preceded by H" needs a special discussion, which is given below.

There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).

This is a case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the intended sound. However, by and large, the form of the first word without an H is considered enough for full representation of the sound intended for the first word.

This is a unique situation necessitated by the lack of the hyphen, the space, or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.

If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors

NBGP Co-chairs: Dr. Udaya Narayan Singh, Mr. Mahesh D. Kulkarni and Dr. Ajay Data

Following is the full list of NBGP members with their language expertise:

Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | Data Xgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | Data Xgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | Basanta Kumar Panda | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | Debajit Sharma | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs' Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkz.com | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and .np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | http://vishvakannada.com | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant, httpsमराभारत | India | Hindi

In addition, the following members externally gave inputs to the NBGP for the respective languages/scripts:

Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr. K P Lekhwani | Sindhi
Dr. Birendra Kumar Soy | Mundari Language
Dr. Dinesh Kumar Shrivastav | Magahi Language
Dr. Harvinder Kaur | Gurmukhi Script
Dr. Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
55
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
56
11 AppendixAVisuallyconfusablecharacterssequencesTheTable 21 below shows characters character sequenceswhichmay appear visually
confusingtosomeoftheusersoftheDevanagariscriptHowevertheyarenotconsideredconfusingenoughtobecategorizedasvariants
Confusable1 Confusable2
कU+0915
क़U+0915U+093C
खU+0916
ख़ U+0916U+093C
गU+0917
ग़U+0917U+093C
चU+091A
]U+091AU+093C
छU+091B
^U+091BU+093C
जU+091C
ज़U+091CU+093C
डU+0921
ड़U+0921U+093C
ढU+0922
ढ़U+0922U+093C
फU+092B
फ़U+092BU+093C
Table 21 Visually confusables
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
57
12 AppendixBCross-scriptConfusablesThe Devanagari script has a major set of possible cross-script confusables with the
GurmukhiscriptTheTable22liststhem
InadditiontoGurmukhisomeinstancesofcross-scriptconfusablearefoundwithBengali
GujaratiTeluguKannadaMalayalamandSinhala
None of the combinations listed in Table 22 are considered equivalents of each other
whether semantically or otherwise They are only grouped based on possible visualconfusability
Atfirsttheymaynotlookexactlythesamehoweverinthegivencontexteginabrowser
barasapartofadomainnameorasasinglewordwherethereisnosurroundingtextfromthesamescriptfordistinguishingtheycancreatevisualconfusion
A label canbeconsidered tohaveacross-scriptvariant labelonly if all theconstituent
charactersaksharashaveanequivalentconfusableintheotherscriptIfthereisevenonesingle characterakshara which does not have an equivalent visual confusable in other
scriptitessentiallyprovidesavisualdistinctionandhenceanon-confusablestring
Devanagariconfusable Otherscriptconfusable Fromscript
ः
U+0903
ઃU+0A83 Gujarati
ः
U+0903
ః
U+0C03Telugu
ः
U+0903
ಃ
U+0C83Kannada
ः
U+0903
ഃU+0D03
Malayalam
ः
U+0903
ඃU+0A28
Sinhala
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
58
ः
U+0903
ঃ17
U+0983 Bengali
उ
U+0909
ওU+0993
Bengali
घ
U+0918
ঘU+0998
Bengali
ठ
U+0920
ਨ
U+0A28Gurmukhi
ठ
U+0920
ਰ
U+0A30Gurmukhi
ड
U+0921
ਡ
U+0A21Gurmukhi
ड
U+0921
ਤ
U+0A24Gurmukhi
ढU+0922
ਢ
U+0A22Gurmukhi
त
U+0924
ਜ
U+0A1CGurmukhi
य
U+092F
ਧ
U+0A27Gurmukhi
U+0945
U+0981
Bengali
Table 22 Devanagari Cross-script confusables 17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in normative part of the Devanagari and Bengali LGRs It was decided that both look different enough not to be included in the normative part and hence have been added in the Appendix as confusables only
ै U+0948 | ੈ U+0A48
ॖ U+0956 | ੁ U+0A41
ॗ U+0957 | ੂ U+0A42
प्टि U+092A U+094D U+091F U+093F | ਇ U+0A07
प्टी U+092A U+094D U+091F U+0940 | ਈ U+0A08
प्टे U+092A U+094D U+091F U+0947 | ਏ U+0A0F
प्टॆ U+092A U+094D U+091F U+0946 | ਏ U+0A0F
त्त U+0924 U+094D U+0924 | ਜ U+0A1C
Table 19: Proposed Cross-script Devanagari-Gurmukhi Variants
Devanagari | Bengali
म U+092E | ম U+09AE
ि U+093F | ি U+09BF
Table 20: Proposed Cross-script Devanagari-Bengali Variants
In addition to the above cases, the Devanagari and Gurmukhi scripts have a possible set of cross-script confusables which look similar, but not similar enough to be recommended as cross-script variants. Table 22 (Devanagari Cross-script confusables) in Appendix B: Cross-script Confusables lists them.
7 Whole Label Evaluation Rules (WLE)
This section provides the WLEs that are required by all the languages mentioned in Section 3.2 when written in Devanagari Script. The rules have been drafted in such a way that they can be easily translated into the LGR specification.
Below are the symbols used in the WLE rules for each of the categories mentioned in Table 6: Code point repertoire.
C → Consonant
M → Matra
V → Vowel
B → Anusvara (Bindu)
D → Candrabindu
X → Visarga
H → Halant / Virama
N → Nukta
S → Eyelash Reph (C2 H C3), where C2 is U+0931 (ऱ DEVANAGARI LETTER RRA), H is U+094D (् DEVANAGARI SIGN VIRAMA), and C3 is either U+092F (य DEVANAGARI LETTER YA) or U+0939 (ह DEVANAGARI LETTER HA)
Below are the specific WLE rules:
1. N must be preceded only by a member of C1, V1 or M1.
The set C1 consists of these consonants:
a. क (U+0915)
b. ख (U+0916)
c. ग (U+0917)
d. च (U+091A)
e. छ (U+091B)
f. ज (U+091C)
g. ड (U+0921)
h. ढ (U+0922)
i. फ (U+092B)
The set V1 consists of these vowels:
a. आ (U+0906) (required in the Santali language)
b. ओ (U+0913) (required in the Santali language)
The set M1 consists of these matras:
a. ा (U+093E) (required in the Santali language)
b. ो (U+094B) (required in the Santali language)
2. H must be preceded by C or CN (where CN is a C followed by an N).
3. M must be preceded by C or CN (where CN is a C followed by an N).
4. X must be preceded by either of V, C, N or M.
5. B must be preceded by either of V, C, N or M.
6. D must be preceded by either of V, C, N or M.
7. V can NOT be preceded by H (details in "Case of V preceded by H").
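The seven rules above can be sketched as a small checker. This Python sketch is illustrative only: the category sets C, V and M are approximated by contiguous ranges of the Unicode Devanagari block (an assumption, not the repertoire of Table 6), and the name `wle_violations` is hypothetical. It inspects only the preceding code point(s), following the wording of the rules.

```python
# Assumption: categories approximated by Unicode ranges, not by Table 6.
C = {chr(cp) for cp in range(0x0915, 0x093A)}  # consonants U+0915..U+0939
V = {chr(cp) for cp in range(0x0905, 0x0915)}  # vowels U+0905..U+0914
M = {chr(cp) for cp in range(0x093E, 0x094D)}  # matras U+093E..U+094C
B, D, X, H, N = "\u0902", "\u0901", "\u0903", "\u094D", "\u093C"

# Sets named in rule 1 (taken from the lists in this section).
C1 = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")
V1 = {"\u0906", "\u0913"}
M1 = {"\u093E", "\u094B"}
NUKTA_HOSTS = C1 | V1 | M1
SIGN_RULE = {X: "rule 4", B: "rule 5", D: "rule 6"}


def wle_violations(label):
    """Return a list of (index, rule) pairs for code points that break a rule."""
    errors = []
    for i, ch in enumerate(label):
        prev = label[i - 1] if i > 0 else ""
        prev2 = label[i - 2] if i > 1 else ""
        # "C or CN": a consonant, or a nukta that itself follows a consonant.
        after_c_or_cn = prev in C or (prev == N and prev2 in C)
        if ch == N:
            if prev not in NUKTA_HOSTS:
                errors.append((i, "rule 1"))
        elif ch == H:
            if not after_c_or_cn:
                errors.append((i, "rule 2"))
        elif ch in M:
            if not after_c_or_cn:
                errors.append((i, "rule 3"))
        elif ch in SIGN_RULE:
            if not (prev in V or prev in C or prev == N or prev in M):
                errors.append((i, SIGN_RULE[ch]))
        elif ch in V:
            if prev == H:
                errors.append((i, "rule 7"))
    return errors
```

For instance, under these assumptions `wle_violations("\u0939\u093F\u0902\u0926\u0940")` (हिंदी) reports nothing, while a nukta placed after त (U+0924), which is not in C1, is reported under rule 1.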
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1. As U+0931 is added as a part of permissible sequences in Table 7: Sequences, it is permitted only within those specific sequences.
2. The last characters of both the sequences of which U+0931 is a part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of a context rule is required.
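The first reason can be illustrated with a short check that U+0931 appears only at the start of one of the two Eyelash Reph sequences described for symbol S. This is a hedged Python sketch: it assumes those two sequences are the only ones in Table 7 that contain U+0931, and the function name `rra_only_in_sequences` is hypothetical.

```python
RRA, VIRAMA, YA, HA = "\u0931", "\u094D", "\u092F", "\u0939"

# The two sequences C2 H C3 given for symbol S in this section.
EYELASH_SEQUENCES = (RRA + VIRAMA + YA, RRA + VIRAMA + HA)


def rra_only_in_sequences(label):
    """True if every U+0931 in the label starts a permitted sequence."""
    i = label.find(RRA)
    while i != -1:
        # str.startswith accepts a tuple of prefixes and a start offset.
        if not label.startswith(EYELASH_SEQUENCES, i):
            return False
        i = label.find(RRA, i + 1)
    return True
```

Because each sequence ends in a consonant, whatever follows it can be checked by the ordinary consonant rules, which is the second reason given above.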
Case of V preceded by H
As any valid akshar in Devanagari begins with either a Consonant or a Vowel, in the case of multi-word domains it was necessary to check the compatibility of both of these to succeed any valid akshar-ending character. It is to be noted that only the case "V preceded by H" needs a special discussion, as given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार /aːməchaːr/ "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of hyphen, space or the Zero Width Non-joiner character in the permissible set of characters in the Root zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceptive similarity between two labels (with and without H) for the majority of the linguistic community; hence this is explicitly prohibited by the NBGP.
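The pair of labels discussed here (with and without H) can be made concrete. The Python sketch below is an illustration under assumptions: the vowel class is approximated by the range U+0905 to U+0914 rather than taken from the repertoire, and both function names are hypothetical.

```python
import re

# A Halant (U+094D) immediately followed by an independent vowel.
# Assumption: vowels approximated by the range U+0905..U+0914.
H_BEFORE_V = re.compile("\u094D(?=[\u0905-\u0914])")


def has_vowel_after_halant(label):
    """True if the label contains the prohibited 'V preceded by H' pattern."""
    return H_BEFORE_V.search(label) is not None


def without_halant_before_vowel(label):
    """Return the counterpart label with each such Halant dropped."""
    return H_BEFORE_V.sub("", label)


with_h = "\u0906\u092E\u094D\u0905\u091A\u093E\u0930"  # the example from the text
without_h = without_halant_before_vowel(with_h)
```

Here `with_h` contains the prohibited pattern, while `without_h` is the form that, per the text, most of the community considers sufficient.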
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
8 Contributors
NBGP Co-chairs: Dr Udaya Narayan Singh, Mr Mahesh D Kulkarni and Dr Ajay Data
Following is the full list of NBGP members with their language expertise:
Position | Name | Organization | Country | Language Expertise
Co-Chair | Ajay Data | DataXgen Technologies | India | Hindi, English
Co-Chair | Mahesh D Kulkarni | C-DAC | India | Marathi, Hindi, English
Co-Chair | Udaya Narayana Singh | Visva-Bharati, Santiniketan, West Bengal | India | Bengali, Maithili, Hindi, English
Member | Abhijit Dutta | Wikimedia | India | Bengali, Hindi
Member | Akshat S Joshi (Editor) | C-DAC | India | Hindi, Marathi, English
Member | Anivar A Aravind | Indic Project | India | Malayalam
Member | Anupam Agrawal | Tata Consultancy Service | India | Hindi, Bengali
Member | Arvind Bhandari | Gujarat University | India | Gujarati
Member | Ashish Modi | DataXgen Technologies | India | Hindi
Member | Atiur Rahman Khan | C-DAC | India | Bangla
Member | Bal Krishna Bal | Kathmandu University | Nepal | Nepali
Member | Balaram Prasain | Tribhuvan University | Nepal | Nepali
Member | BASANTA KUMAR PANDA | Regional Institute of Education (NCERT) | India | Odia
Member | Bhim Dhoj Shrestha | Consultant | Nepal | Nepali, Newar
Member | Chitrita Chatterjee | Internet and Mobile Association of India (IAMAI) | India | Multiple languages represented by members of IAMAI
Member | DEBAJIT SHARMA | Anundoram Borooah Institute of Language, Art and Culture | India | Assamese
Member | Dev Dass Manandhar | Consultant | Nepal | Nepali, Newar
Member | Dhanalakshmi K T | Northern Trust | India | Kannada
Member | Ganesh Murmu | Ranchi University | India | Santali
Member | Gangadhar Panday | Babul Films Society | India | Telugu
Member | Ghanashyam Nepal | Benares Hindu University & University of North Bengal | India | Nepali
Member | Girish Chandra Mishra | Language Technology Centre, Ravenshaw University | India | Odia
Member | Gurpreet Singh Lehal | Punjabi University, Patiala | India | Panjabi
Member | Harish Chowdhary | NIXI | India | Hindi
Member | Hempal Shrestha | Nepal Entrepreneurs Hub (NEHUB) | Nepal | Nepali, Newar
Member | Jay Paudyal | Consultant | India | Hindi
Member | Jijo Pappachan | DNDomains | India | Malayalam
Member | K C Tikayatray | Odia Bhasa Pratisthan | India | Odia
Member | Kalyan Vasudeo Kale | Formerly affiliated with University of Pune | India | Marathi
Member | Kuldeep Patnaik | Visualizethysoul | India | Odia
Member | Mukesh Saini | Essel Group | India | Hindi
Member | N Deiva Sundaram | NDS Lingsoft Solutions Pvt Ltd | India | Tamil
Member | Neha Gupta | C-DAC | India | Hindi, English
Member | Nirajan Parajuli | NREN | Nepal | Nepali
Member | Nishit Jain | C-DAC | India | Hindi, English
Member | Pawan Chitrakar | Gapsco | Nepal | Nepali
Member | Prabhakar Pandey | C-DAC | India | Hindi
Member | Prasad P K | A-one Publishers | India | Malayalam
Member | Prateek Pathak | ISOC Mumbai | India | Devanagari
Member | Raiomond Doctor | NLP Consultant | India | English, Hindi, Marathi, Gujarati
Member | Rajib Chakraborty | Society for Natural Language Technology Research | India | Bangla (Bengali)
Member | Rajiv Kumar | NIXI | India |
Member | S Maniam | International Forum IT for Tamil | Singapore | Tamil
Member | Santhosh Thottingal | Wikimedia foundation | India | Malayalam, Sourashtra, Tamil
Member | Saroja Bhate | University of Pune | India | Sanskrit
Member | Shambhu Kumar Singh | National Translation Mission, Mysore | India | Maithili
Member | Shanmugam R | C-DAC | India | Tamil
Member | Shantaram S Warde Walawalikar | Independent Researcher | India | Konkani
Member | Shashi Pathania | PGD of Dogri, University of Jammu | India | Dogri
Member | Shubham Saran | NIXI | India |
Member | Sinnathambi Shanmugarajah | University of Colombo School of Computing | Sri Lanka | Tamil
Member | Sujith Kartha | Digitalkzcom | India | Malayalam
Member | Suraj Adhikari | Mercantile Communications (and np ccTLD) | Nepal | Nepali
Member | Swarna Prabha Chainary | Guwahati University | India | Bodo
Member | U B Pavanaja | httpvishvakannadacom | India | Kannada
Member | Uma Maheshwar G | CALTS, Univ of Hyderabad | India | Telugu
Member | Uttam Shrestha Rana | NPNOG | Nepal | Nepali
Member | Veena Solomon | (freelancer) | India | Malayalam
Member | Vinay Murarka | Consultant httpsमराभारत | India | Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages/scripts:
Name | Language/Script Expertise
Ajit Kumar | Awadhi, Braj Language
Amar Tumyahang | Limbu Language
Amrit Yonjan | Tamang Language
Aprana Kulkarni | Hindi, Marathi
Basil Baa | Sadri Language
Basil Kiro | Kharia Language
Biswa Limbu | Limbu Language
Devdass Manandhar | Newar
Devendra Kumar Devesh | Bhojpuri Language
Dinbandhu Mahto | Panchpargania Language
Dipika Sangma Narzary | Bodo Language
Dr K P Lekhwani | Sindhi
Dr Birendra Kumar Soy | Mundari Language
Dr Dinesh Kumar Shrivastav | Magahi Language
Dr Harvinder Kaur | Gurmukhi Script
Dr Laxmi Prasad Khatiwada | Nepali Language
Harihar Vaishnav | Halbi
Indra Kumar Tamang | Tamang Language
Jagannath Singh | Panchpargania Language
Narendra Kumar Negi | Kinnauri Language
Prateek Harshwal | Wagdi and Dhundhari Language
Rayem Olem Dungdung | Sadri Language
Tej Man Angdembe | Limbu Language
Urmila Harshwal | Wagdi Language
9 References
[MSR] Integration Panel, "Maximal Starting Repertoire - MSR-4: Overview and Rationale", 7 February 2019, https://www.icann.org/en/system/files/files/msr-4-overview-25jan19-en.pdf (Accessed on 18th Feb 2019)
[EGIDS] Expanded Graded Intergenerational Disruption Scale, https://www.ethnologue.com/about/language-status (Accessed on 13th Nov 2017)
[NBGP] Neo-Brahmi Generation Panel
[RFC7940] Davies, K. and A. Freytag, "Representing Label Generation Rulesets Using XML", RFC 7940, August 2016, https://tools.ietf.org/html/rfc7940 (Accessed on 1st Dec 2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0/ (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M K Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.
10.1 WRITING SYSTEMS
1. Dillinger, D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchison, London, 1968.
10.2 DEVANĀGARĪ
1. Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (pp. 12-16). Delhi: Publications Division.
2. Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3. Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966.]
4. Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5. Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (pp. 384-390). New York: Oxford University Press.
6. Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7. Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8. Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9. Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10. Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11. Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12. Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13. McGregor, R. S. (1977). Outline of Hindi Grammar, 2nd ed. Delhi: Oxford University Press.
14. McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15. McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16. McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17. Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10(2), 139-156.
18. Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19. Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20. Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (pp. 85-107). New Delhi: D. K. Printworld.
21. Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22. Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23. Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.
10.3 INDIC COMPUTING SPECIFIC
1. IS 10401: 8-bit code for information interchange, 1982
2. IS 10315: 7-bit coded character set for information interchange, 1985
3. IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques, 1987
4. ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters, 2001
5. ISO 2375: Procedure for registration of escape sequences, 2003
6. ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13, 1998-2001
7. IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters and character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क U+0915 | क़ U+0915 U+093C
ख U+0916 | ख़ U+0916 U+093C
ग U+0917 | ग़ U+0917 U+093C
च U+091A | च़ U+091A U+093C
छ U+091B | छ़ U+091B U+093C
ज U+091C | ज़ U+091C U+093C
ड U+0921 | ड़ U+0921 U+093C
ढ U+0922 | ढ़ U+0922 U+093C
फ U+092B | फ़ U+092B U+093C
Table 21: Visually confusables
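Every pair in Table 21 differs only by a Nukta (U+093C) on one of nine consonants, so a reviewer could flag such pairs by comparing labels with those Nuktas removed. The Python sketch below is purely illustrative and not part of the LGR, since these pairs are explicitly not variants; the names `skeleton` and `visually_confusable` are hypothetical.

```python
NUKTA = "\u093C"

# The nine base consonants listed in Table 21.
NUKTA_BASES = set("\u0915\u0916\u0917\u091A\u091B\u091C\u0921\u0922\u092B")


def skeleton(label):
    """Drop each Nukta that directly follows one of the Table 21 consonants."""
    out = []
    for ch in label:
        if ch == NUKTA and out and out[-1] in NUKTA_BASES:
            continue
        out.append(ch)
    return "".join(out)


def visually_confusable(a, b):
    """True if two different labels differ only by such Nuktas."""
    return a != b and skeleton(a) == skeleton(b)
```

A registry review tool might use a check like this to surface candidate pairs for human inspection rather than to block labels automatically.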
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same; however, in the given context (e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script to distinguish them) they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
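The all-or-nothing condition in the last paragraph can be sketched in Python. The mapping below is a hand-copied subset of Table 22 (Gurmukhi and Bengali rows only), it works per code point rather than per akshara, and the names `TABLE_22_SUBSET` and `fully_confusable_in` are hypothetical; treat it as an illustration under those assumptions.

```python
# Devanagari code point -> confusable code point(s) in the other script.
# Assumption: subset of Table 22, single code points only.
TABLE_22_SUBSET = {
    "Gurmukhi": {
        "\u0920": "\u0A28\u0A30",  # ठ -> ਨ, ਰ
        "\u0921": "\u0A21\u0A24",  # ड -> ਡ, ਤ
        "\u0922": "\u0A22",        # ढ -> ਢ
        "\u0924": "\u0A1C",        # त -> ਜ
        "\u092F": "\u0A27",        # य -> ਧ
    },
    "Bengali": {
        "\u0903": "\u0983",        # visarga
        "\u0909": "\u0993",        # उ -> ও
        "\u0918": "\u0998",        # घ -> ঘ
        "\u0945": "\u0981",        # candra E sign -> candrabindu
    },
}


def fully_confusable_in(label, script):
    """True only if every code point of the label has a confusable in script."""
    table = TABLE_22_SUBSET[script]
    return bool(label) and all(ch in table for ch in label)
```

A single code point without a counterpart, such as म next to त and य for Gurmukhi, is enough to make the whole label distinguishable under this rule.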
Devanagari confusable | Other script confusable | From script
ः U+0903 | ઃ U+0A83 | Gujarati
ः U+0903 | ః U+0C03 | Telugu
ः U+0903 | ಃ U+0C83 | Kannada
ः U+0903 | ഃ U+0D03 | Malayalam
ः U+0903 | ඃ U+0D83 | Sinhala
ः U+0903 | ঃ U+0983 (see note 17) | Bengali
उ U+0909 | ও U+0993 | Bengali
घ U+0918 | ঘ U+0998 | Bengali
ठ U+0920 | ਨ U+0A28 | Gurmukhi
ठ U+0920 | ਰ U+0A30 | Gurmukhi
ड U+0921 | ਡ U+0A21 | Gurmukhi
ड U+0921 | ਤ U+0A24 | Gurmukhi
ढ U+0922 | ਢ U+0A22 | Gurmukhi
त U+0924 | ਜ U+0A1C | Gurmukhi
य U+092F | ਧ U+0A27 | Gurmukhi
ॅ U+0945 | ঁ U+0981 | Bengali
Table 22: Devanagari Cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
44
In addition to above cases the Devanagari and Gurmukhi scripts have a possible set ofcross-scriptconfusableswhichlooksimilarbutnotsimilarenoughtoberecommendedas
cross-scriptvariantsTheTable22DevanagariCross-scriptconfusablesinAppendixBCross-scriptConfusablesliststhem
7 WholeLabelEvaluationRules(WLE)ThissectionprovidestheWLEsthatarerequiredbyallthelanguagesmentionedinSection
32whenwritteninDevanagariScriptTheruleshavebeendraftedinsuchawaythattheycanbeeasilytranslatedintotheLGRspecification
BelowarethesymbolsusedintheWLErules foreachoftheCategoryasmentionedin
theTable6Codepointrepertoire
C rarr Consonant
M rarr Matra
V rarr Vowel
B rarr Anusvara(Bindu)
D rarr Candrabindu
X rarr Visarga
H rarr HalantVirama
N rarr Nukta
S rarr EyelashReph(C2HC3)whereC2is0931(ऱ-DEVANAGARILETTERRRA)His094D( - DEVANAGARISIGNVIRAMA)C3iseither-092F(य - DEVANAGARILETTERYA)or0939(ह - DEVANAGARILETTERHA)
BelowarethespecificWLErules
1 NmustbeprecededonlybyamemberofC1V1orM1
ThesetC1consistsoftheseconsonants
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
45
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
ThesetV1consistsofthesevowels
a आ (U+0906)(RequiredinSantalilanguage)
b ओ (U+0913)(RequiredinSantalilanguage)
ThesetM1consistsofthesematras
a ा (U+093E)(RequiredinSantalilanguage)
b ो (U+094B)(RequiredinSantalilanguage)
2 HmustbeprecededbyCorCN15
3 MmustbeprecededbyCorCN16
4 XmustbeprecededbyeitherofVCNorM
5 BmustbeprecededbyeitherofVCNorM
6 DmustbeprecededbyeitherofVCNorM
7 VCanNOTbeprecededbyH(detailsinCaseofVprecededbyH)
15 where CN is a C followed by an N 16 where CN is a C followed by an N
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
46
Additional rules are used only for variants where a Nuktamaps to a null or that areoverlapped
bull VariantisnotdefinediffollowedbyaNukta(see641)
bull VariantundefinedifitisnotfollowedbyVorC(includingRRA)orendoflabel(See612)
CaseofEyelashReph
IntheWLErulesthereisnospecificmentionoftheEyelashRephfortworeasons1 AstheU+0931isaddedasapartofpermissiblesequencesinTable7Sequencesit
getspermittedonlywiththespecificsequences
2 The last characters of both the sequences of which the U+0931 is part are
consonants As the Eyelash-Reph can take all the combinations as that of a
consonantnospecifichandlingintermsofcontextruleisrequired
CaseofVprecededbyH
Asanyvalidakshar inDevanagari begins eitherwith aConsonantor aVowel in caseofmulti-words domains it was necessary to check the compatibility of both of these tosucceed any of the validakshar ending character It is to be noted that only the case ldquoVprecededbyHrdquoneedsaspecialdiscussionasgivenbelow
There couldbe cases involvingmulti-worddomainswhereVmayneed tobe allowed tofollow anH egआमअचार aːməchaːrMango pickle (U+0906 U+092EU+094DU+0905U+091AU+093EU+0930)
ThisisthecasewheretwodifferentwordsarejoinedtogetherfirstofwhichendsinanHandthesecondwordbeginswithaVSomesectionsofthelinguisticcommunityrequiretheexplicitpresenceofHforfullrepresentationofthesoundintendedHoweverbyandlarge
theformofthefirstwordwithoutanHisconsideredenoughforfullrepresentationofthesoundintendedforthefirstword
ThisisauniquesituationnecessitatedbythelackofhyphenspaceortheZeroWidthNon-
joinercharacterinthepermissiblesetofcharactersintheRootzonerepertoireOtherwiseV isneverrequiredtobeallowedtofollowanHPermittingthismaycreateaperceptivesimilarityamongtwolabels(withandwithoutH)formajorityofthelinguisticcommunity
hencethisisexplicitlyprohibitedbytheNBGP
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
47
If required in future depending on the prevailing requirements by the community the
NBGPmayconsiderrevisitingthisrule
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
48
8 ContributorsNBGPCo-chairsDrUdayaNarayanSinghMrMaheshDKulkarniandDrAjayData
FollowingisthefulllistofNBGPmemberswiththeirLanguageexpertise
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
49
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
50
In addition following members externally gave inputs to NBGP for the respectivelanguagesscripts
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters and character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1 | Confusable 2
क (U+0915) | क़ (U+0915 U+093C)
ख (U+0916) | ख़ (U+0916 U+093C)
ग (U+0917) | ग़ (U+0917 U+093C)
च (U+091A) | च़ (U+091A U+093C)
छ (U+091B) | छ़ (U+091B U+093C)
ज (U+091C) | ज़ (U+091C U+093C)
ड (U+0921) | ड़ (U+0921 U+093C)
ढ (U+0922) | ढ़ (U+0922 U+093C)
फ (U+092B) | फ़ (U+092B U+093C)
Table 21: Visually confusables
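The pairing in Table 21 can be illustrated with a small sketch: two labels are flagged for review when they differ only by a Nukta (U+093C) on one of the nine listed consonants. This is not part of the LGR, where these pairs are explicitly not variants. The function names and the use of NFD normalization are assumptions made for illustration.

```python
import unicodedata

NUKTA = "\u093C"
# The nine base consonants listed in Table 21.
NUKTA_BASES = {
    "\u0915", "\u0916", "\u0917", "\u091A", "\u091B",
    "\u091C", "\u0921", "\u0922", "\u092B",
}


def strip_listed_nuktas(label: str) -> str:
    """Drop a Nukta that directly follows one of the Table 21 consonants."""
    # NFD splits precomposed letters such as U+0958 into base + U+093C.
    decomposed = unicodedata.normalize("NFD", label)
    kept = []
    for ch in decomposed:
        if ch == NUKTA and kept and kept[-1] in NUKTA_BASES:
            continue  # skip the Nukta, keep the base consonant
        kept.append(ch)
    return "".join(kept)


def may_be_confused(label_a: str, label_b: str) -> bool:
    """True if the labels are distinct but equal once listed Nuktas are removed."""
    a = unicodedata.normalize("NFD", label_a)
    b = unicodedata.normalize("NFD", label_b)
    return a != b and strip_listed_nuktas(a) == strip_listed_nuktas(b)
```

A Nukta on a consonant outside Table 21 (for example U+0930) is left in place, so such labels are not flagged by this sketch.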
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.
At first they may not look exactly the same. However, in a given context, e.g. in a browser bar, as a part of a domain name, or as a single word with no surrounding text from the same script to distinguish them, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.
Devanagari confusable | Other script confusable | From script
ः (U+0903) | ઃ (U+0A83) | Gujarati
ः (U+0903) | ః (U+0C03) | Telugu
ः (U+0903) | ಃ (U+0C83) | Kannada
ः (U+0903) | ഃ (U+0D03) | Malayalam
ः (U+0903) | ඃ (U+0D83) | Sinhala
ः (U+0903) | ঃ (U+0983) (note 17) | Bengali
उ (U+0909) | ও (U+0993) | Bengali
घ (U+0918) | ঘ (U+0998) | Bengali
ठ (U+0920) | ਨ (U+0A28) | Gurmukhi
ठ (U+0920) | ਰ (U+0A30) | Gurmukhi
ड (U+0921) | ਡ (U+0A21) | Gurmukhi
ड (U+0921) | ਤ (U+0A24) | Gurmukhi
ढ (U+0922) | ਢ (U+0A22) | Gurmukhi
त (U+0924) | ਜ (U+0A1C) | Gurmukhi
य (U+092F) | ਧ (U+0A27) | Gurmukhi
ॅ (U+0945) | ঁ (U+0981) | Bengali
Table 22: Devanagari cross-script confusables
17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that both look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
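The all-or-nothing condition stated before Table 22 (every constituent must have a confusable in the same other script) can be sketched as a set intersection. This is an illustrative sketch only. It works per code point, not per akshara, and the names `CONFUSABLES` and `confusable_scripts` are hypothetical. The Sinhala entry uses U+0D83, the Unicode code point of the Sinhala visarga.

```python
# Devanagari code point -> {other script: [confusable code points]}, from Table 22.
CONFUSABLES = {
    "\u0903": {
        "Gujarati": ["\u0A83"], "Telugu": ["\u0C03"], "Kannada": ["\u0C83"],
        "Malayalam": ["\u0D03"], "Sinhala": ["\u0D83"], "Bengali": ["\u0983"],
    },
    "\u0909": {"Bengali": ["\u0993"]},
    "\u0918": {"Bengali": ["\u0998"]},
    "\u0920": {"Gurmukhi": ["\u0A28", "\u0A30"]},
    "\u0921": {"Gurmukhi": ["\u0A21", "\u0A24"]},
    "\u0922": {"Gurmukhi": ["\u0A22"]},
    "\u0924": {"Gurmukhi": ["\u0A1C"]},
    "\u092F": {"Gurmukhi": ["\u0A27"]},
    "\u0945": {"Bengali": ["\u0981"]},
}


def confusable_scripts(label: str) -> list:
    """Scripts in which every code point of the label has a listed confusable."""
    if not label:
        return []
    common = None
    for ch in label:
        scripts = set(CONFUSABLES.get(ch, {}))
        common = scripts if common is None else common & scripts
        if not common:
            return []  # one code point without a counterpart breaks the confusion
    return sorted(common)
```

For instance, a label made of ठ and ड is confusable with Gurmukhi as a whole, while adding a single क leaves no script in which the whole label could be mistaken.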
a क (U+0915)
b ख (U+0916)
c ग (U+0917)
d च (U+091A)
e छ (U+091B)
f ज (U+091C)
g ड (U+0921)
h ढ (U+0922)
i फ (U+092B)
The set V1 consists of these vowels:
a आ (U+0906) (Required in Santali language)
b ओ (U+0913) (Required in Santali language)
The set M1 consists of these matras:
a ा (U+093E) (Required in Santali language)
b ो (U+094B) (Required in Santali language)
2 H must be preceded by C or CN (note 15)
3 M must be preceded by C or CN (note 16)
4 X must be preceded by either of V, C, N or M
5 B must be preceded by either of V, C, N or M
6 D must be preceded by either of V, C, N or M
7 V CANNOT be preceded by H (details in "Case of V preceded by H")
15 where CN is a C followed by an N
16 where CN is a C followed by an N
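Rules 2 to 7 above can be sketched as a predecessor check over a sequence of class symbols. This is a minimal illustration, not the normative XML rule set. It takes the class letters (C, N, H, M, V, X, B, D) as already assigned, since their code point membership is defined elsewhere in the proposal. Rule 1 is not shown in this part of the text and is omitted, and the function name is hypothetical.

```python
# Classes that may precede X, B and D (rules 4 to 6).
V_C_N_M = {"V", "C", "N", "M"}


def check_context_rules(classes):
    """Return (index, class) for every position that breaks rules 2 to 7."""
    violations = []
    for i, cur in enumerate(classes):
        prev = classes[i - 1] if i > 0 else None
        prev2 = classes[i - 2] if i > 1 else None
        if cur in ("H", "M"):
            # Rules 2 and 3: preceded by C, or by CN (a C followed by an N).
            ok = prev == "C" or (prev == "N" and prev2 == "C")
        elif cur in ("X", "B", "D"):
            # Rules 4 to 6: preceded by V, C, N or M.
            ok = prev in V_C_N_M
        elif cur == "V":
            # Rule 7: V cannot be preceded by H.
            ok = prev != "H"
        else:
            ok = True
        if not ok:
            violations.append((i, cur))
    return violations
```

The label discussed under "Case of V preceded by H" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930) classifies as V C H V C M C, so the sketch reports the V at index 3. The same label without the H, V C V C M C, passes.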
Additional rules are used only for variants where a Nukta maps to a null or that are overlapped:
• Variant is not defined if followed by a Nukta (see 6.4.1)
• Variant is undefined if it is not followed by V or C (including RRA) or end of label (see 6.1.2)
Case of Eyelash Reph
In the WLE rules there is no specific mention of the Eyelash Reph, for two reasons:
1 As U+0931 is added as a part of permissible sequences in Table 7 (Sequences), it is permitted only within those specific sequences.
2 The last characters of both the sequences of which U+0931 is part are consonants. As the Eyelash Reph can take all the combinations that a consonant can, no specific handling in terms of a context rule is required.
Case of V preceded by H
Because any valid akshar in Devanagari begins with either a Consonant or a Vowel, multi-word domains made it necessary to check whether each of these can follow any valid akshar-ending character. Only the case "V preceded by H" needs special discussion, given below.
There could be cases involving multi-word domains where V may need to be allowed to follow an H, e.g. आम्अचार aːməchaːr "Mango pickle" (U+0906 U+092E U+094D U+0905 U+091A U+093E U+0930).
This is the case where two different words are joined together, the first of which ends in an H and the second of which begins with a V. Some sections of the linguistic community require the explicit presence of H for full representation of the sound intended. However, by and large the form of the first word without an H is considered enough for full representation of the sound intended for the first word.
This is a unique situation necessitated by the lack of a hyphen, a space or the Zero Width Non-joiner character in the permissible set of characters in the Root Zone repertoire. Otherwise V is never required to be allowed to follow an H. Permitting this may create a perceived similarity between two labels (with and without H) for the majority of the linguistic community; hence it is explicitly prohibited by the NBGP.
If required in future, depending on the prevailing requirements of the community, the NBGP may consider revisiting this rule.
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
46
Additional rules are used only for variants where a Nuktamaps to a null or that areoverlapped
bull VariantisnotdefinediffollowedbyaNukta(see641)
bull VariantundefinedifitisnotfollowedbyVorC(includingRRA)orendoflabel(See612)
CaseofEyelashReph
IntheWLErulesthereisnospecificmentionoftheEyelashRephfortworeasons1 AstheU+0931isaddedasapartofpermissiblesequencesinTable7Sequencesit
getspermittedonlywiththespecificsequences
2 The last characters of both the sequences of which the U+0931 is part are
consonants As the Eyelash-Reph can take all the combinations as that of a
consonantnospecifichandlingintermsofcontextruleisrequired
CaseofVprecededbyH
Asanyvalidakshar inDevanagari begins eitherwith aConsonantor aVowel in caseofmulti-words domains it was necessary to check the compatibility of both of these tosucceed any of the validakshar ending character It is to be noted that only the case ldquoVprecededbyHrdquoneedsaspecialdiscussionasgivenbelow
There couldbe cases involvingmulti-worddomainswhereVmayneed tobe allowed tofollow anH egआमअचार aːməchaːrMango pickle (U+0906 U+092EU+094DU+0905U+091AU+093EU+0930)
ThisisthecasewheretwodifferentwordsarejoinedtogetherfirstofwhichendsinanHandthesecondwordbeginswithaVSomesectionsofthelinguisticcommunityrequiretheexplicitpresenceofHforfullrepresentationofthesoundintendedHoweverbyandlarge
theformofthefirstwordwithoutanHisconsideredenoughforfullrepresentationofthesoundintendedforthefirstword
ThisisauniquesituationnecessitatedbythelackofhyphenspaceortheZeroWidthNon-
joinercharacterinthepermissiblesetofcharactersintheRootzonerepertoireOtherwiseV isneverrequiredtobeallowedtofollowanHPermittingthismaycreateaperceptivesimilarityamongtwolabels(withandwithoutH)formajorityofthelinguisticcommunity
hencethisisexplicitlyprohibitedbytheNBGP
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
47
If required in future depending on the prevailing requirements by the community the
NBGPmayconsiderrevisitingthisrule
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
48
8 ContributorsNBGPCo-chairsDrUdayaNarayanSinghMrMaheshDKulkarniandDrAjayData
FollowingisthefulllistofNBGPmemberswiththeirLanguageexpertise
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
49
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
50
In addition following members externally gave inputs to NBGP for the respectivelanguagesscripts
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
9 References
[MSR]IntegrationPanelMaximalStartingRepertoiremdashMSR-4OverviewandRationale7February2019httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf(Accessedon18thFeb2019)
[EGIDS]ExpandedGradedIntergenerationalDisruptionScalehttpswwwethnologuecomaboutlanguage-status(Accessedon13thNov2017)
[NBGP]Neo-BrahmiGenerationPanel
[RFC7940]DaviesKandAFreytagRepresentingLabelGenerationRulesetsusingXMLRFC7940August2016httpstoolsietforghtmlrfc7940(Accessedon1stDec2017)
[RFC8228] A. Freytag, "Guidance on Designing Label Generation Rulesets (LGRs) Supporting Variant Labels", August 2017, https://tools.ietf.org/html/rfc8228 (Accessed on 1st Dec 2017)
[gTLD] generic Top Level Domain
[ISCII] Indian Script Code for Information Interchange, https://cdac.in/index.aspx?id=mlc_gist_iscii (Accessed on 2nd Feb 2018)
[GIST] Graphics Intelligence based Script Technologies, https://cdac.in/index.aspx?id=gist (Accessed on 2nd Feb 2018)
[C-DAC] Centre for Development of Advanced Computing, https://cdac.in (Accessed on 2nd Feb 2018)
[0] The Unicode Standard 1.1, http://www.unicode.org/versions/Unicode1.1.0/ (Accessed on 12th Dec 2017)
[8] The Unicode Standard 5.0, http://www.unicode.org/versions/Unicode5.0.0/ (Accessed on 12th Dec 2017)
[9] The Unicode Standard 5.1, http://www.unicode.org/versions/Unicode5.1.0/ (Accessed on 12th Dec 2017)
[11] The Unicode Standard 6.0, http://www.unicode.org/versions/Unicode6.0.0/ (Accessed on 12th Dec 2017)
[100] Devanāgarī VIP Team, "Variant Issues Report", ICANN, 3rd Oct 2011, https://archive.icann.org/en/topics/new-gtlds/devanagari-vip-issues-report-03oct11-en.pdf (Accessed on 10th Oct 2017)
[101] Omniglot, Hindi, https://www.omniglot.com/writing/hindi.htm (Accessed on 10th Oct 2017)
[102] Omniglot, Marathi, https://www.omniglot.com/writing/marathi.htm (Accessed on 10th Oct 2017)
[103] Omniglot, Sanskrit, https://www.omniglot.com/writing/sanskrit.htm (Accessed on 10th Oct 2017)
[104] Omniglot, Sindhi, https://www.omniglot.com/writing/sindhi.htm (Accessed on 10th Oct 2017)
[105] Omniglot, Kashmiri, https://www.omniglot.com/writing/kashmiri.htm (Accessed on 10th Oct 2017)
[106] Unicode 10.0.0, "South and Central Asia-I - Official Scripts of India", Page 456 (R5 and R5a), http://www.unicode.org/versions/Unicode10.0.0/ch12.pdf (Accessed on 13th Nov 2017)
[107] Unicode Indic Group, Devanagari Eyelash Ra, http://unicode.org/~emuller/iwg/p8/utcdoc.html (Accessed on 13th Nov 2017)
[108] M. K. Raina, How to read and write Kashmiri in Devanagari, http://www.koshur.org/pdf/Let%20Us%20Learn%20Kashmiri.pdf (Accessed on 12th Dec 2017)
[109] Central Hindi Directorate - Ministry of HRD - Govt. of India, Devanāgarī Alphabet and its Romanization, httphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml (Accessed on 12th Dec 2017)
[110] Omniglot, Bodo, https://www.omniglot.com/writing/bodo.htm (Accessed on 12th Dec 2017)
[111] Omniglot, Maithili, https://www.omniglot.com/writing/maithili.htm (Accessed on 12th Dec 2017)
[112] Omniglot, Konkani, https://www.omniglot.com/writing/konkani.htm (Accessed on 20th May 2018)
[113] Omniglot, Nepali, https://www.omniglot.com/writing/nepali.htm (Accessed on 20th May 2018)
[114] NBGP, Public comment feedback for Devanagari, Gujarati, Gurmukhi Script LGR Proposals, https://docs.google.com/document/d/1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw (Accessed on 18th Feb 2019)
10 Books, articles and webographies consulted
Following is a thematically sorted set of documents, books, articles and webographies consulted in the drafting of this report.

10.1 WRITING SYSTEMS
1 Diringer, D. The Alphabet: A Key to the History of Mankind. 3rd Edition in 2 Volumes. Hutchinson, London, 1968.

10.2 DEVANĀGARĪ
1 Agrawala, V. S. (1966). The Devanāgarī script. In Indian Systems of Writing (Pp. 12-16). Delhi: Publications Division.
2 Agyeya, Sacchindanand Hiranand Vatsyayan. 1972. Bhavanti. Delhi: Rajpal and Sons.
3 Beames, John. 1872-79. A Comparative Grammar of the Modern Aryan Languages of India. 3 vols. London: Trubner and Co. [Reprinted by Munshiram Manoharlal, New Delhi, 1966]
4 Bhatia, Tej K. 1987. A History of the Hindi Grammatical Tradition: Hindi-Hindustani Grammar, Grammarians, History and Problems. Leiden, New York: E. J. Brill.
5 Bright, W. (1996). The Devanāgarī script. In P. Daniels and W. Bright (eds.), The World's Writing Systems (Pp. 384-390). New York: Oxford University Press.
6 Cardona, George. 1987. Sanskrit. In The World's Major Languages, Bernard Comrie (ed.). London: Croom Helm. 448-469.
7 Dwivedi, Ram Awadh. 1966. A Critical Survey of Hindi Literature. Delhi: Motilal Banarsidass.
8 Faruqi, Shamsur Rahman. 2001. Early Urdu Literary Culture and History. Delhi: Oxford University Press.
9 Guru, Kamta Prasad. 1919. Hindi Vyakaran. Varanasi: Nagari Pracharini Sabha (1962 edition).
10 Kachru, Yamuna. 1965. A Transformational Treatment of Hindi Verbal Syntax. London: University of London PhD dissertation (Mimeographed).
11 Kachru, Yamuna. 1966. An Introduction to Hindi Syntax. Urbana: University of Illinois, Department of Linguistics.
12 Kalyan Kale and Anjali Soman. 1986. Learning Marathi. Shri Vishakha Prakashan, Pune.
13 McGregor, R. S. (1977). Outline of Hindi Grammar, 2nd ed. Delhi: Oxford University Press.
14 McGregor, R. S. 1972. Outline of Hindi Grammar with Exercises. Delhi: Oxford University Press.
15 McGregor, R. S. 1974. Hindi Literature of the Nineteenth and Early Twentieth Centuries. Wiesbaden: Harrassowitz.
16 McGregor, R. S. 1984. Hindi Literature from Its Beginnings to the Nineteenth Century. Wiesbaden: Harrassowitz.
17 Pandey, P. K. (2007). Phonology-orthography interface in Devanāgarī for Hindi. Written Language and Literacy, 10 (2), 139-156.
18 Rai, Amrit. 1984. A House Divided: The Origin and Development of Hindi/Hindavi. Delhi: Oxford University Press.
19 Sharad, Onkar. 1969. Lohiya ke Vicar. Allahabad: Lokbharati Prakashan.
20 Singh, A. K. (2007). Progress of modification of Brāhmī alphabet as revealed by the inscriptions of sixth-eighth centuries. In P. G. Patel, P. Pandey and D. Rajgor (eds.), The Indic Scripts: Paleographic and Linguistic Perspectives (Pp. 85-107). New Delhi: D. K. Printworld.
21 Sproat, R. (2000). A Computational Theory of Writing Systems. Cambridge University Press.
22 Tiwari, Pandit Udaynarayan. 1961. Hindi Bhasha ka Udgam aur Vikas [The Origin and Development of the Hindi Language]. Prayag: Leader Press.
23 Verma, M. K. 1971. The Structure of the Noun Phrase in English and Hindi. Delhi: Motilal Banarsidass.

10.3 INDIC COMPUTING SPECIFIC
1 IS 10401: 8-bit code for information interchange. 1982.
2 IS 10315: 7-bit coded character set for information interchange. 1985.
3 IS 12326: 7-bit and 8-bit coded character sets - Code extension techniques. 1987.
4 ISO 15919: Information and documentation - Transliteration of Devanāgarī and related Indic scripts into Latin characters. 2001.
5 ISO 2375: Procedure for registration of escape sequences. 2003.
6 ISO 8859: 8-bit single-byte coded graphic character sets - Parts 1-13. 1998-2001.
7 IDN POLICY: http://meity.gov.in/writereaddata/files/India-IDN-Policy.pdf
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.

Confusable 1     Confusable 2
क  U+0915        क़  U+0915 U+093C
ख  U+0916        ख़  U+0916 U+093C
ग  U+0917        ग़  U+0917 U+093C
च  U+091A        च़  U+091A U+093C
छ  U+091B        छ़  U+091B U+093C
ज  U+091C        ज़  U+091C U+093C
ड  U+0921        ड़  U+0921 U+093C
ढ  U+0922        ढ़  U+0922 U+093C
फ  U+092B        फ़  U+092B U+093C

Table 21: Visual confusables
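
Every pair in Table 21 differs only by the Nukta (U+093C) placed after a base consonant. As a minimal sketch of how a reviewer might flag two labels that differ only in this way, the Python below drops such a Nukta before comparing. The names `TABLE_21_BASES`, `nukta_skeleton` and `may_be_confused` are hypothetical helpers introduced for illustration; they are not part of the LGR or of any LGR tool, and the check is advisory only since these pairs are not variants.

```python
import unicodedata

NUKTA = "\u093c"  # DEVANAGARI SIGN NUKTA

# Base consonants listed in the "Confusable 1" column of Table 21.
TABLE_21_BASES = {
    "\u0915", "\u0916", "\u0917", "\u091a", "\u091b",
    "\u091c", "\u0921", "\u0922", "\u092b",
}


def nukta_skeleton(label: str) -> str:
    """Drop any Nukta that directly follows a Table 21 base consonant."""
    # NFD splits precomposed letters such as U+0958 into base + U+093C.
    decomposed = unicodedata.normalize("NFD", label)
    out = []
    for ch in decomposed:
        if ch == NUKTA and out and out[-1] in TABLE_21_BASES:
            continue  # skip the Nukta, keep the base consonant
        out.append(ch)
    return "".join(out)


def may_be_confused(a: str, b: str) -> bool:
    """True when two distinct labels differ only by Table 21 Nuktas."""
    a_nfd = unicodedata.normalize("NFD", a)
    b_nfd = unicodedata.normalize("NFD", b)
    return a_nfd != b_nfd and nukta_skeleton(a) == nukta_skeleton(b)
```

For instance, under these assumptions `may_be_confused("\u091c\u093e", "\u091c\u093c\u093e")` compares जा with ज़ा and reports them as a possible confusable pair, while two identical labels are not reported.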
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.

In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.

None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are only grouped based on possible visual confusability.

At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as a part of a domain name, or as a single word where there is no surrounding text from the same script for distinguishing, they can create visual confusion.

A label can be considered to have a cross-script variant label only if all the constituent characters/aksharas have an equivalent confusable in the other script. If there is even one single character/akshara which does not have an equivalent visual confusable in the other script, it essentially provides a visual distinction and hence a non-confusable string.

Devanagari confusable    Other script confusable    From script
ः  U+0903                ઃ  U+0A83                  Gujarati
ः  U+0903                ః  U+0C03                  Telugu
ः  U+0903                ಃ  U+0C83                  Kannada
ः  U+0903                ഃ  U+0D03                  Malayalam
ः  U+0903                ඃ  U+0D83                  Sinhala
ः  U+0903                ঃ  U+0983                  Bengali (see note 17)
उ  U+0909                ও  U+0993                  Bengali
घ  U+0918                ঘ  U+0998                  Bengali
ठ  U+0920                ਨ  U+0A28                  Gurmukhi
ठ  U+0920                ਰ  U+0A30                  Gurmukhi
ड  U+0921                ਡ  U+0A21                  Gurmukhi
ड  U+0921                ਤ  U+0A24                  Gurmukhi
ढ  U+0922                ਢ  U+0A22                  Gurmukhi
त  U+0924                ਜ  U+0A1C                  Gurmukhi
य  U+092F                ਧ  U+0A27                  Gurmukhi
ॅ  U+0945                ঁ  U+0981                  Bengali

Table 22: Devanagari cross-script confusables

17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
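
The all-or-nothing condition described in this appendix can be sketched in Python as below. This is only an illustration under stated assumptions: the mapping `TABLE_22` is transcribed from Table 22 (the Sinhala entry uses U+0D83, the code point of the character shown, ඃ), the function name `cross_script_confusable_scripts` is hypothetical, and the check works per code point as a simplification, whereas the text speaks of characters/aksharas. It is not part of the normative LGR.

```python
# Devanagari code point -> {other script: confusable code points}, per Table 22.
TABLE_22 = {
    "\u0903": {
        "Gujarati": {"\u0a83"}, "Telugu": {"\u0c03"}, "Kannada": {"\u0c83"},
        "Malayalam": {"\u0d03"}, "Sinhala": {"\u0d83"}, "Bengali": {"\u0983"},
    },
    "\u0909": {"Bengali": {"\u0993"}},
    "\u0918": {"Bengali": {"\u0998"}},
    "\u0920": {"Gurmukhi": {"\u0a28", "\u0a30"}},
    "\u0921": {"Gurmukhi": {"\u0a21", "\u0a24"}},
    "\u0922": {"Gurmukhi": {"\u0a22"}},
    "\u0924": {"Gurmukhi": {"\u0a1c"}},
    "\u092f": {"Gurmukhi": {"\u0a27"}},
    "\u0945": {"Bengali": {"\u0981"}},
}


def cross_script_confusable_scripts(label):
    """Return the scripts in which EVERY code point of `label` has a
    Table 22 counterpart; an empty set means the label is distinguishable."""
    if not label:
        return set()
    scripts = None
    for ch in label:
        here = set(TABLE_22.get(ch, {}))
        scripts = here if scripts is None else scripts & here
        if not scripts:
            # One code point without a counterpart is enough to
            # make the whole label visually distinct.
            return set()
    return scripts
```

Under these assumptions the label ठड (U+0920 U+0921) is fully confusable with Gurmukhi only, whereas ठक (U+0920 U+0915) is not confusable with any script, because क has no entry in Table 22.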
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
48
8 ContributorsNBGPCo-chairsDrUdayaNarayanSinghMrMaheshDKulkarniandDrAjayData
FollowingisthefulllistofNBGPmemberswiththeirLanguageexpertise
Position Name Organization Country Language
ExpertiseCo-Chair AjayData DataXgenTechnologies India HindiEnglishCo-Chair MaheshDKulkarni C-DAC India MarathiHindi
EnglishCo-Chair UdayaNarayana
SinghVisva-BharatiSantiniketanWestBengal
India BengaliMaithiliHindiEnglish
Member AbhijitDutta Wikimedia India BengaliHindiMember AkshatSJoshi
(Editor)C-DAC India HindiMarathi
EnglishMember AnivarAAravind IndicProject India MalayalamMember AnupamAgrawal TataConsultancyService India HindiBengaliMember ArvindBhandari GujaratUniversity India GujaratiMember AshishModi DataXgenTechnologies India HindiMember AtiurRahmanKhan C-DAC India BanglaMember BalKrishnaBal KathmanduUniversity Nepal NepaliMember BalaramPrasain TribhuvanUniversity Nepal NepaliMember BASANTAKUMAR
PANDARegionalInstituteofEducation(NCERT)
India Odia
Member BhimDhojShrestha Consultant Nepal NepaliNewarMember ChitritaChatterjee InternetandMobileAssociationof
India(IAMAI)India Multiplelanguages
representedbymembersofIAMAI
Member DEBAJITSHARMA AnundoramBorooahInstituteofLanguageArtandCulture
India Assamese
Member DevDassManandhar
Consultant Nepal NepaliNewar
Member DhanalakshmiKT NorthernTrust India KannadaMember GaneshMurmu RanchiUniversity India SantaliMember GangadharPanday BabulFilmsSociety India TeluguMember GhanashyamNepal BenaresHinduUniversityamp
UniversityofNorthBengalIndia Nepali
Member GirishChandraMishra
LanguageTechnologyCentreRavenshawUniversity
India Odia
Member GurpreetSinghLehal
PunjabiUniversityPatiala India Panjabi
Member HarishChowdhary NIXI India HindiMember HempalShrestha NepalEntrepreneursHub
(NEHUB)Nepal NepaliNewar
Member JayPaudyal Consultant India Hindi
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
49
Member JijoPappachan DNDomains India MalayalamMember KCTikayatray OdiaBhasaPratisthan India OdiaMember KalyanVasudeo
KaleFormerlyaffiliatedwithUniversityofPune
India Marathi
Member KuldeepPatnaik Visualizethysoul India OdiaMember MukeshSaini EsselGroup India HindiMember NDeivaSundaram NDSLingsoftSolutionsPvtLtd India TamilMember NehaGupta C-DAC India HindiEnglishMember NirajanParajuli NREN Nepal NepaliMember NishitJain C-DAC India HindiEnglishMember PawanChitrakar Gapsco Nepal NepaliMember PrabhakarPandey C-DAC India HindiMember PrasadPK A-onePublishers India MalayalamMember PrateekPathak ISOCMumbai India DevanagariMember RaiomondDoctor NLPConsultant India EnglishHindi
MarathiGujaratiMember RajibChakraborty SocietyforNaturalLanguage
TechnologyResearchIndia Bangla(Bengali)
Member RajivKumar NIXI India Member SManiam InternationalForumITforTamil Singapore TamilMember SanthoshThottingal Wikimediafoundation India Malayalam
SourashtraTamilMember SarojaBhate UniversityofPune India SanskritMember ShambhuKumar
SinghNationalTranslationMissionMysore
India Maithili
Member ShanmugamR C-DAC India TamilMember ShantaramSWarde
WalawalikarIndependentResearcher India Konkani
Member ShashiPathania PGDofDogriUniversityofJammu
India Dogri
Member ShubhamSaran NIXI India Member Sinnathambi
ShanmugarajahUniversityofColomboSchoolofComputing
SriLanka Tamil
Member SujithKartha Digitalkzcom India MalayalamMember SurajAdhikari MercantileCommunications(and
npccTLD)Nepal Nepali
Member SwarnaPrabhaChainary
GuwahatiUniversity India Bodo
Member UBPavanaja httpvishvakannadacom India KannadaMember UmaMaheshwarG CALTSUnivofHyderabad India TeluguMember UttamShrestha
RanaNPNOG Nepal Nepali
Member VeenaSolomon (freelancer) India MalayalamMember VinayMurarka Consultanthttpsमराभारत India Hindi
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
50
In addition following members externally gave inputs to NBGP for the respectivelanguagesscripts
Name LanguageScriptExpertiseAjitKumar AwadhiBrajLanguageAmarTumyahang LimbuLanguageAmritYonjan TamangLanguageApranaKulkarni HindiMarathiBasilBaa SadriLanguageBasilKiro KhariaLanguageBiswaLimbu LimbuLanguageDevdassManandhar NewarDevendraKumarDevesh BhojpuriLanguageDinbandhuMahto PanchparganiaLanguageDipikaSangmaNarzary BodoLanguageDrKPLekhwani SindhiDrBirendraKumarSoy MundariLanguageDrDineshKumarShrivastav MagahiLanguageDrHarvinderKaur GurmukhiScriptDrLaxmiPrasadKhatiwada NepaliLanguageHariharVaishnav HalbiIndraKumarTamang TamangLanguageJagannathSingh PanchparganiaLanguageNarendraKumarNegi KinnauriLanguagePrateekHarshwal WagdiandDhundhariLanguageRayemOlemDungdung SadriLanguageTejManAngdembe LimbuLanguageUrmilaHarshwal WagdiLanguage
9 References
[MSR]IntegrationPanelMaximalStartingRepertoiremdashMSR-4OverviewandRationale7February2019httpswwwicannorgensystemfilesfilesmsr-4-overview-25jan19-enpdf(Accessedon18thFeb2019)
[EGIDS]ExpandedGradedIntergenerationalDisruptionScalehttpswwwethnologuecomaboutlanguage-status(Accessedon13thNov2017)
[NBGP]Neo-BrahmiGenerationPanel
[RFC7940]DaviesKandAFreytagRepresentingLabelGenerationRulesetsusingXMLRFC7940August2016httpstoolsietforghtmlrfc7940(Accessedon1stDec2017)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
51
[RCF8228]AFreytagldquoGuidanceonDesigningLabelGenerationRulesets(LGRs)SupportingVariantLabelsAugust2017httpstoolsietforghtmlrfc8228(Accessedon1stDec2017)
[gTLD]genericTopLevelDomain
[ISCII]IndianScriptCodeforInformationInterchangehttpscdacinindexaspxid=mlc_gist_iscii(Accessedon2ndFeb2018)
[GIST]GraphicsIntelligencebasedScriptTechnologieshttpscdacinindexaspxid=gist(Accessedon2ndFeb2018)
[C-DAC]CentreforDevelopmentofAdvancedComputinghttpscdacin(Accessedon2ndFeb2018)
[0]TheUnicodeStandard11httpwwwunicodeorgversionsUnicode110(Accessedon12thDec2017)
[8]TheUnicodeStandard50httpwwwunicodeorgversionsUnicode500(Accessedon12thDec2017)
[9]TheUnicodeStandard51httpwwwunicodeorgversionsUnicode510(Accessedon12thDec2017)
[11]TheUnicodeStandard60httpwwwunicodeorgversionsUnicode600(Accessedon12thDec2017)
[100]DevanāgarīVIPTeamldquoVariantIssuesReportrdquoICANN3rdOct2011httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf(Accessedon10thOct2017)
[101]OmniglotHindihttpswwwomniglotcomwritinghindihtm(Accessedon10thOct2017)
[102]OmniglotMarathihttpswwwomniglotcomwritingmarathihtm(Accessedon10thOct2017)
[103]OmniglotSanskrithttpswwwomniglotcomwritingsanskrithtm(Accessedon10thOct2017)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
52
[104]OmniglotSindhihttpswwwomniglotcomwritingsindhihtm(Accessedon10thOct2017)
[105]OmniglotKashmirihttpswwwomniglotcomwritingkashmirihtm(Accessedon10thOct2017)
[106]Unicode1000SouthandCentralAsia-I-OfficialScriptsofIndiardquoPage456(R5andR5a)httpwwwunicodeorgversionsUnicode1000ch12pdf(Accessedon13thNov2017)
[107]UnicodeIndicGroupDevanagariEyelashRahttpunicodeorg~emulleriwgp8utcdochtml(Accessedon13thNov2017)
[108]MKRainaHowtoreadandwriteKashmiriinDevanagarihttpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf(Accessedon12thDec2017)
[109]CentralHindiDirectorate-MinistryofHRD-GovtofIndiaDevanāgarīAlphabetanditsRomanizationhttphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml(Accessedon12thDec2017
[110]OmniglotBodohttpswwwomniglotcomwritingbodohtm(Accessedon12thDec2017)
[111]OmniglotMaithilihttpswwwomniglotcomwritingmaithilihtm(Accessedon12thDec2017)
[112]OmniglotKonkanihttpswwwomniglotcomwritingkonkanihtm(Accessedon20thMay2018)
[113]OmniglotNepalihttpswwwomniglotcomwritingnepalihtm(Accessedon20thMay2018)
[114] NBGP Public comment feedback for Devanagari Gujarati Gurmukhi Script LGRProposalshttpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw(Accessedon18thFeb2019)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
53
10 Booksarticlesandwebographiesconsulted
Followingisathematicallysortedsetofdocumentsbooksarticlesandwebographies
consultedinthedraftingofthisreport
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
11 Appendix A: Visually confusable characters/sequences
Table 21 below shows characters/character sequences which may appear visually confusing to some users of the Devanagari script. However, they are not considered confusing enough to be categorized as variants.
Confusable 1    Confusable 2
क U+0915        क़ U+0915 U+093C
ख U+0916        ख़ U+0916 U+093C
ग U+0917        ग़ U+0917 U+093C
च U+091A        च़ U+091A U+093C
छ U+091B        छ़ U+091B U+093C
ज U+091C        ज़ U+091C U+093C
ड U+0921        ड़ U+0921 U+093C
ढ U+0922        ढ़ U+0922 U+093C
फ U+092B        फ़ U+092B U+093C
Table 21: Visual confusables
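Every pair in Table 21 is a base consonant and the same consonant followed by the nukta (U+093C). The following is a minimal sketch, not part of the LGR, of how a registry tool might flag two labels that differ only by such a nukta. The function names `strip_table21_nukta` and `may_be_confused` are hypothetical, and the check is purely advisory because these pairs are not variants.

```python
import unicodedata

NUKTA = "\u093C"  # DEVANAGARI SIGN NUKTA

# Base consonants listed in Table 21 (assumption: only these are checked).
TABLE_21_BASES = {
    "\u0915", "\u0916", "\u0917", "\u091A", "\u091B",
    "\u091C", "\u0921", "\u0922", "\u092B",
}


def strip_table21_nukta(label: str) -> str:
    """Drop a nukta that directly follows a Table 21 base consonant."""
    text = unicodedata.normalize("NFC", label)
    out = []
    for ch in text:
        if ch == NUKTA and out and out[-1] in TABLE_21_BASES:
            continue  # skip the nukta, keep the base consonant
        out.append(ch)
    return "".join(out)


def may_be_confused(a: str, b: str) -> bool:
    """True when two distinct labels differ only by Table 21 nuktas."""
    na = unicodedata.normalize("NFC", a)
    nb = unicodedata.normalize("NFC", b)
    return na != nb and strip_table21_nukta(na) == strip_table21_nukta(nb)
```

For example, `may_be_confused("\u091C\u093C\u0930\u093E", "\u091C\u0930\u093E")` compares a label containing ज़ with the same label containing ज. A nukta after a consonant outside Table 21 is left in place, so such labels are not flagged.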
12 Appendix B: Cross-script Confusables
The Devanagari script has a major set of possible cross-script confusables with the Gurmukhi script. Table 22 lists them.
In addition to Gurmukhi, some instances of cross-script confusables are found with Bengali, Gujarati, Telugu, Kannada, Malayalam and Sinhala.
None of the combinations listed in Table 22 are considered equivalents of each other, whether semantically or otherwise. They are grouped only on the basis of possible visual confusability.
At first they may not look exactly the same; however, in a given context, e.g. in a browser bar as part of a domain name, or as a single word with no surrounding text from the same script to distinguish it, they can create visual confusion.
A label can be considered to have a cross-script variant label only if all of its constituent characters/aksharas have an equivalent confusable in the other script. If even a single character/akshara has no equivalent visual confusable in the other script, it provides a visual distinction, and the string is therefore non-confusable.
Devanagari confusable    Other-script confusable    From script
ः U+0903                 ઃ U+0A83                   Gujarati
ः U+0903                 ః U+0C03                   Telugu
ः U+0903                 ಃ U+0C83                   Kannada
ः U+0903                 ഃ U+0D03                   Malayalam
ः U+0903                 ඃ U+0D83                   Sinhala
ः U+0903                 ঃ U+0983                   Bengali (see note 17)
उ U+0909                 ও U+0993                   Bengali
घ U+0918                 ঘ U+0998                   Bengali
ठ U+0920                 ਨ U+0A28                   Gurmukhi
ठ U+0920                 ਰ U+0A30                   Gurmukhi
ड U+0921                 ਡ U+0A21                   Gurmukhi
ड U+0921                 ਤ U+0A24                   Gurmukhi
ढ U+0922                 ਢ U+0A22                   Gurmukhi
त U+0924                 ਜ U+0A1C                   Gurmukhi
य U+092F                 ਧ U+0A27                   Gurmukhi
ॅ U+0945                 ঁ U+0981                   Bengali
Table 22: Devanagari cross-script confusables

17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part; hence they have been added in the Appendix as confusables only.
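The all-or-nothing rule stated in Appendix B (a label has a cross-script look-alike only if every constituent has one in the same script) can be sketched as below. This is an illustration under stated assumptions rather than the panel's implementation: it treats each code point as one unit instead of segmenting aksharas, the names `TABLE_22` and `confusable_scripts` are hypothetical, and the Sinhala entry uses U+0D83 (SINHALA SIGN VISARGAYA), the code point of the glyph shown in the table.

```python
# Table 22 pairs grouped by target script:
# Devanagari code point -> look-alike code point(s) in that script.
TABLE_22 = {
    "Gujarati": {"\u0903": ("\u0A83",)},
    "Telugu": {"\u0903": ("\u0C03",)},
    "Kannada": {"\u0903": ("\u0C83",)},
    "Malayalam": {"\u0903": ("\u0D03",)},
    "Sinhala": {"\u0903": ("\u0D83",)},
    "Bengali": {
        "\u0903": ("\u0983",), "\u0909": ("\u0993",),
        "\u0918": ("\u0998",), "\u0945": ("\u0981",),
    },
    "Gurmukhi": {
        "\u0920": ("\u0A28", "\u0A30"), "\u0921": ("\u0A21", "\u0A24"),
        "\u0922": ("\u0A22",), "\u0924": ("\u0A1C",), "\u092F": ("\u0A27",),
    },
}


def confusable_scripts(label: str) -> list:
    """Return the scripts in which every code point of the label has a
    Table 22 look-alike; one unmatched code point rules a script out."""
    if not label:
        return []  # an empty label has nothing to confuse
    return [
        script
        for script, pairs in TABLE_22.items()
        if all(ch in pairs for ch in label)
    ]
```

A label made only of ठ and ड is reported for Gurmukhi, while adding any character outside the Gurmukhi rows (for example क) yields an empty list, which mirrors the "even one single character" condition in the text.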
Member  Jijo Pappachan                 DNDomains                                         India      Malayalam
Member  K C Tikayatray                 Odia Bhasa Pratisthan                             India      Odia
Member  Kalyan Vasudeo Kale            Formerly affiliated with University of Pune       India      Marathi
Member  Kuldeep Patnaik                Visualizethysoul                                  India      Odia
Member  Mukesh Saini                   Essel Group                                       India      Hindi
Member  N Deiva Sundaram               NDS Lingsoft Solutions Pvt Ltd                    India      Tamil
Member  Neha Gupta                     C-DAC                                             India      Hindi, English
Member  Nirajan Parajuli               NREN                                              Nepal      Nepali
Member  Nishit Jain                    C-DAC                                             India      Hindi, English
Member  Pawan Chitrakar                Gapsco                                            Nepal      Nepali
Member  Prabhakar Pandey               C-DAC                                             India      Hindi
Member  Prasad P K                     A-one Publishers                                  India      Malayalam
Member  Prateek Pathak                 ISOC Mumbai                                       India      Devanagari
Member  Raiomond Doctor                NLP Consultant                                    India      English, Hindi, Marathi, Gujarati
Member  Rajib Chakraborty              Society for Natural Language Technology Research  India      Bangla (Bengali)
Member  Rajiv Kumar                    NIXI                                              India
Member  S Maniam                       International Forum IT for Tamil                  Singapore  Tamil
Member  Santhosh Thottingal            Wikimedia foundation                              India      Malayalam, Sourashtra, Tamil
Member  Saroja Bhate                   University of Pune                                India      Sanskrit
Member  Shambhu Kumar Singh            National Translation Mission, Mysore              India      Maithili
Member  Shanmugam R                    C-DAC                                             India      Tamil
Member  Shantaram S Warde Walawalikar  Independent Researcher                            India      Konkani
Member  Shashi Pathania                PGD of Dogri, University of Jammu                 India      Dogri
Member  Shubham Saran                  NIXI                                              India
Member  Sinnathambi Shanmugarajah      University of Colombo School of Computing         Sri Lanka  Tamil
Member  Sujith Kartha                  Digitalkz.com                                     India      Malayalam
Member  Suraj Adhikari                 Mercantile Communications (and .np ccTLD)         Nepal      Nepali
Member  Swarna Prabha Chainary         Guwahati University                               India      Bodo
Member  U B Pavanaja                   http://vishvakannada.com                          India      Kannada
Member  Uma Maheshwar G                CALTS, Univ of Hyderabad                          India      Telugu
Member  Uttam Shrestha Rana            NPNOG                                             Nepal      Nepali
Member  Veena Solomon                  (freelancer)                                      India      Malayalam
Member  Vinay Murarka                  Consultant, httpsमराभारत                          India      Hindi
In addition, the following members externally gave inputs to the NBGP for the respective languages/scripts:

Name                        Language/Script Expertise
Ajit Kumar                  Awadhi/Braj Language
Amar Tumyahang              Limbu Language
Amrit Yonjan                Tamang Language
Aprana Kulkarni             Hindi, Marathi
Basil Baa                   Sadri Language
Basil Kiro                  Kharia Language
Biswa Limbu                 Limbu Language
Devdass Manandhar           Newar
Devendra Kumar Devesh       Bhojpuri Language
Dinbandhu Mahto             Panchpargania Language
Dipika Sangma Narzary       Bodo Language
Dr K P Lekhwani             Sindhi
Dr Birendra Kumar Soy       Mundari Language
Dr Dinesh Kumar Shrivastav  Magahi Language
Dr Harvinder Kaur           Gurmukhi Script
Dr Laxmi Prasad Khatiwada   Nepali Language
Harihar Vaishnav            Halbi
Indra Kumar Tamang          Tamang Language
Jagannath Singh             Panchpargania Language
Narendra Kumar Negi         Kinnauri Language
Prateek Harshwal            Wagdi and Dhundhari Language
Rayem Olem Dungdung         Sadri Language
Tej Man Angdembe            Limbu Language
Urmila Harshwal             Wagdi Language
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
51
[RCF8228]AFreytagldquoGuidanceonDesigningLabelGenerationRulesets(LGRs)SupportingVariantLabelsAugust2017httpstoolsietforghtmlrfc8228(Accessedon1stDec2017)
[gTLD]genericTopLevelDomain
[ISCII]IndianScriptCodeforInformationInterchangehttpscdacinindexaspxid=mlc_gist_iscii(Accessedon2ndFeb2018)
[GIST]GraphicsIntelligencebasedScriptTechnologieshttpscdacinindexaspxid=gist(Accessedon2ndFeb2018)
[C-DAC]CentreforDevelopmentofAdvancedComputinghttpscdacin(Accessedon2ndFeb2018)
[0]TheUnicodeStandard11httpwwwunicodeorgversionsUnicode110(Accessedon12thDec2017)
[8]TheUnicodeStandard50httpwwwunicodeorgversionsUnicode500(Accessedon12thDec2017)
[9]TheUnicodeStandard51httpwwwunicodeorgversionsUnicode510(Accessedon12thDec2017)
[11]TheUnicodeStandard60httpwwwunicodeorgversionsUnicode600(Accessedon12thDec2017)
[100]DevanāgarīVIPTeamldquoVariantIssuesReportrdquoICANN3rdOct2011httpsarchiveicannorgentopicsnew-gtldsdevanagari-vip-issues-report-03oct11-enpdf(Accessedon10thOct2017)
[101]OmniglotHindihttpswwwomniglotcomwritinghindihtm(Accessedon10thOct2017)
[102]OmniglotMarathihttpswwwomniglotcomwritingmarathihtm(Accessedon10thOct2017)
[103]OmniglotSanskrithttpswwwomniglotcomwritingsanskrithtm(Accessedon10thOct2017)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
52
[104]OmniglotSindhihttpswwwomniglotcomwritingsindhihtm(Accessedon10thOct2017)
[105]OmniglotKashmirihttpswwwomniglotcomwritingkashmirihtm(Accessedon10thOct2017)
[106]Unicode1000SouthandCentralAsia-I-OfficialScriptsofIndiardquoPage456(R5andR5a)httpwwwunicodeorgversionsUnicode1000ch12pdf(Accessedon13thNov2017)
[107]UnicodeIndicGroupDevanagariEyelashRahttpunicodeorg~emulleriwgp8utcdochtml(Accessedon13thNov2017)
[108]MKRainaHowtoreadandwriteKashmiriinDevanagarihttpwwwkoshurorgpdfLet20Us20Learn20Kashmiripdf(Accessedon12thDec2017)
[109]CentralHindiDirectorate-MinistryofHRD-GovtofIndiaDevanāgarīAlphabetanditsRomanizationhttphindinideshalayanicinenglishhindi_orgindevnagarithesysmbolshtml(Accessedon12thDec2017
[110]OmniglotBodohttpswwwomniglotcomwritingbodohtm(Accessedon12thDec2017)
[111]OmniglotMaithilihttpswwwomniglotcomwritingmaithilihtm(Accessedon12thDec2017)
[112]OmniglotKonkanihttpswwwomniglotcomwritingkonkanihtm(Accessedon20thMay2018)
[113]OmniglotNepalihttpswwwomniglotcomwritingnepalihtm(Accessedon20thMay2018)
[114] NBGP Public comment feedback for Devanagari Gujarati Gurmukhi Script LGRProposalshttpsdocsgooglecomdocumentd1CLKdJBTNDcC_sFFs5s0a_Bk0zQUER2BIruYuyCNgkAw(Accessedon18thFeb2019)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
53
10 Booksarticlesandwebographiesconsulted
Followingisathematicallysortedsetofdocumentsbooksarticlesandwebographies
consultedinthedraftingofthisreport
101 WRITINGSYSTEMS1 DillingerDTheAlphabetAKeytotheHistoryofMankind3rdEditionin2
VolumesHutchisonLondon1968
102 DEVANĀGARĪ1 AgrawalaVS(1966)TheDevanāgarīscriptInIndianSystemsofWriting(Pp12-
16)DelhiPublicationsDivision
2 AgyeyaSacchindanandHiranandVatsyayan1972BhavantiDelhiRajpalandSons
3 BeamesJohn1872-79AComparativeGrammaroftheModernAryanLanguagesof
India3volsLondonTrubnerandCo[ReprintedbyMunshiramManoharlalNew
Delhi1966]
4 BhatiaTejK1987AHistoryoftheHindiGrammaticalTraditionHindi-Hindustani
GrammarGrammariansHistoryandProblemsLeidenNewYorkEJBrill
5 BrightW(1996)TheDevanāgarīscriptInPDanielsandWBright(eds)The
WorldrsquosWritingSystems(Pp384-390)NewYorkOxfordUniversityPress
6 CardonaGeorge1987SanskritInTheWorldsMajorLanguagesBernardComrie
(ed)LondonCroomHelm448-469
7 DwivediRamAwadh1966ACriticalSurveyofHindiLiteratureDelhiMotilal
Banarsidass
8 FaruqiShamsurRahman2001EarlyUrduLiteraryCultureandHistoryDelhi
OxfordUniversityPress
9 GuruKamtaPrasad1919HindiVyakaranVaranasiNagariPrachariniSabha
(1962edition)
10 KachruYamuna1965ATransformationalTreatmentofHindiVerbalSyntax
LondonUniversityofLondonPhDdissertation(Mimeographed)
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
54
11 KachruYamuna1966AnIntroductiontoHindiSyntaxUrbanaUniversityof
IllinoisDepartmentofLinguistics
12 KalyanKaleandAnjaliSoman1986LearningMarathiShriVishakhaPrakashan
Pune
13 McGregorRS(1977)OutlineofHindiGrammar2ndedDelhiOxfordUniversity
Press
14 McGregorRS1972OutlineofHindiGrammarwithExercisesDelhiOxford
UniversityPress
15 McGregorRS1974HindiLiteratureoftheNineteenthandEarlyTwentieth
CenturiesWiesbadenHarrassowitz
16 McGregorRS1984HindiLiteraturefromItsBeginningstotheNineteenth
CenturyWiesbadenHarrassowitz
17 PandeyPK(2007)Phonology-orthographyinterfaceinDevanāgarīforHindi
WrittenLanguageandLiteracy10(2)139-1562007
18 RaiAmrit1984AHouseDividedTheOriginandDevelopmentofHindiHindavi
DelhiOxfordUniversityPress
19 SharadOnkar1969LohiyakeVicarAllahabadLokbharatiPrakashan
20 SinghAK(2007)ProgressofmodificationofBrāhmīalphabetasrevealedbythe
inscriptionsofsixth-eighthcenturiesInPGPatelPPandeyandDRajgor(eds)
TheIndicScriptsPaleographicandLinguisticPerspectives(Pp85-107)New
DelhiDKPrintworld
21 SproatR(2000)AComputationalTheoryofWritingSystemsCambridge
UniversityPress
22 TiwariPanditUdaynarayan1961HindiBhashakaUdgamaurVikas[TheOrigin
andDevelopmentoftheHindiLanguage]PrayagLeaderPress
23 VermaMK1971TheStructureoftheNounPhraseinEnglishandHindiDelhi
MotilalBanarsidass
103 INDICCOMPUTINGSPECIFIC1 IS104018-bitcodeforinformationinterchange1982
2 IS103157-bitcodedcharactersetforinformationinterchange1985
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
55
3 IS123267-bitand8-bitcodedcharactersets-Codeextensiontechniques1987
4 ISO15919Informationanddocumentation-TransliterationofDevanāgarīand
relatedIndicscriptsintoLatincharacters2001
5 ISO2375Procedureforregistrationofescapesequences2003
6 ISO88598-bitsingle-bytecodedgraphiccharactersets-Parts1-131998-2001
7 IDNPOLICYhttpmeitygovinwritereaddatafilesIndia-IDN-Policypdf
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
56
11 AppendixAVisuallyconfusablecharacterssequencesTheTable 21 below shows characters character sequenceswhichmay appear visually
confusingtosomeoftheusersoftheDevanagariscriptHowevertheyarenotconsideredconfusingenoughtobecategorizedasvariants
Confusable1 Confusable2
कU+0915
क़U+0915U+093C
खU+0916
ख़ U+0916U+093C
गU+0917
ग़U+0917U+093C
चU+091A
]U+091AU+093C
छU+091B
^U+091BU+093C
जU+091C
ज़U+091CU+093C
डU+0921
ड़U+0921U+093C
ढU+0922
ढ़U+0922U+093C
फU+092B
फ़U+092BU+093C
Table 21 Visually confusables
Proposal for a Devanagari Root Zone LGR Neo-Brahmi Generation Panel
57
12 AppendixBCross-scriptConfusablesThe Devanagari script has a major set of possible cross-script confusables with the
GurmukhiscriptTheTable22liststhem
InadditiontoGurmukhisomeinstancesofcross-scriptconfusablearefoundwithBengali
GujaratiTeluguKannadaMalayalamandSinhala
None of the combinations listed in Table 22 are considered equivalents of each other
whether semantically or otherwise They are only grouped based on possible visualconfusability
Atfirsttheymaynotlookexactlythesamehoweverinthegivencontexteginabrowser
barasapartofadomainnameorasasinglewordwherethereisnosurroundingtextfromthesamescriptfordistinguishingtheycancreatevisualconfusion
A label canbeconsidered tohaveacross-scriptvariant labelonly if all theconstituent
charactersaksharashaveanequivalentconfusableintheotherscriptIfthereisevenonesingle characterakshara which does not have an equivalent visual confusable in other
scriptitessentiallyprovidesavisualdistinctionandhenceanon-confusablestring
Devanagari confusable    Other-script confusable         From script
ः (U+0903)               ઃ (U+0A83)                      Gujarati
ः (U+0903)               ః (U+0C03)                      Telugu
ः (U+0903)               ಃ (U+0C83)                      Kannada
ः (U+0903)               ഃ (U+0D03)                      Malayalam
ः (U+0903)               ඃ (U+0D83)                      Sinhala
ः (U+0903)               ঃ (U+0983), see footnote 17     Bengali
उ (U+0909)               ও (U+0993)                      Bengali
घ (U+0918)               ঘ (U+0998)                      Bengali
ठ (U+0920)               ਨ (U+0A28)                      Gurmukhi
ठ (U+0920)               ਰ (U+0A30)                      Gurmukhi
ड (U+0921)               ਡ (U+0A21)                      Gurmukhi
ड (U+0921)               ਤ (U+0A24)                      Gurmukhi
ढ (U+0922)               ਢ (U+0A22)                      Gurmukhi
त (U+0924)               ਜ (U+0A1C)                      Gurmukhi
य (U+092F)               ਧ (U+0A27)                      Gurmukhi
◌ॅ (U+0945)              ◌ঁ (U+0981)                     Bengali

Table 22: Devanagari cross-script confusables

17 The Bengali and Devanagari Visarga pair was discussed at length for inclusion in the normative part of the Devanagari and Bengali LGRs. It was decided that the two look different enough not to be included in the normative part, and hence they have been added in the Appendix as confusables only.
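The all-or-nothing rule stated in this appendix (a label is cross-script confusable only if every constituent has a counterpart in the other script) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the LGR: `TABLE_22` and `confusable_scripts` are hypothetical names, the check works per code point rather than per akshara, and the Sinhala entry uses U+0D83, the code point of the Sinhala visarga glyph shown in the table.

```python
from typing import Dict, List, Set

# Pairs transcribed from Table 22: script -> {Devanagari char -> look-alikes}.
TABLE_22: Dict[str, Dict[str, Set[str]]] = {
    "Gujarati": {"\u0903": {"\u0A83"}},
    "Telugu": {"\u0903": {"\u0C03"}},
    "Kannada": {"\u0903": {"\u0C83"}},
    "Malayalam": {"\u0903": {"\u0D03"}},
    "Sinhala": {"\u0903": {"\u0D83"}},  # SINHALA SIGN VISARGAYA
    "Bengali": {
        "\u0903": {"\u0983"},
        "\u0909": {"\u0993"},
        "\u0918": {"\u0998"},
        "\u0945": {"\u0981"},
    },
    "Gurmukhi": {
        "\u0920": {"\u0A28", "\u0A30"},
        "\u0921": {"\u0A21", "\u0A24"},
        "\u0922": {"\u0A22"},
        "\u0924": {"\u0A1C"},
        "\u092F": {"\u0A27"},
    },
}


def confusable_scripts(label: str) -> List[str]:
    """List scripts in which every code point of `label` has a Table 22 counterpart.

    A single code point without a counterpart is enough to make the label
    visually distinct in that script, so the script is left out.
    """
    if not label:
        return []
    return [
        script
        for script, pairs in TABLE_22.items()
        if all(ch in pairs for ch in label)
    ]
```

For example, a label made only of ठ and ड is reported for Gurmukhi, while adding a letter such as क, which has no entry in Table 22, removes every script from the result.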